I'm looking for a free / open source OCR library for Java, ideally written
in pure Java, or at least an open source OCR library with a Java
interface.
I've seen the two commercial ones, Asprise and JavaOCR, and I'm not
interested in them.
A lot of open source folks link to GOCR, but that is written in C/C++.
Rumor has it there might be a Java bridge to it, but I haven't found
it yet.
Request for Question Clarification by endo-ga on 26 Dec 2004 08:13 PST
Hi,
I was looking for the same thing you are, and I read about a JNI
interface for GOCR, but I never found it.
What I ended up doing was using a modified GOCR executable (that I
recompiled in Visual Studio .NET for speed). The main difference is
that it appends to the output file instead of overwriting it. I then
ran the executable through the Java Process class. It worked
pretty well and it was quite fast.
At the moment I am away from the computer that has all the code for
this project, but in a couple of weeks I will be able to give
you more details. Would such a solution be a satisfactory answer?
Thanks.
endo

Clarification of Question by ttennebkram-ga on 26 Dec 2004 13:22 PST
Endo,
I'm not sure, maybe...
I don't know much about .NET, or about using Java with .NET. I guess I
would need the .NET Framework? I'm not sure whether that's free, what
its footprint is, or how compiling and assembling the entire project
would work.
I'm trying to build a lightweight app that does a relatively small
amount of actual OCR, but that can be easily distributed in one .jar
file, or maybe a .zip. Can this .NET solution be packaged like that?
I had actually wanted to hand images "in memory" to a library, and get
back a text buffer "in memory", but using temp files or sockets might
be ok.
Your solution does sound closer than what I had found previously,
maybe we can talk some more?

Request for Question Clarification by endo-ga on 27 Dec 2004 05:45 PST
Hi,
You wouldn't need anything extra. I was just saying that I had
recompiled the GOCR source for speed, and that it's now faster on
Windows 2k/XP. The Java/GOCR interaction I used was simply launching
the executable from Java and using a file to transfer the data between
Java and GOCR.
I know what you mean about transferring the data "in memory", but that
would require writing the JNI interface. Since I've modified GOCR to
append to the output file instead of overwriting it, the file-based
approach works pretty well.
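To give a rough idea of the launch-and-read-file approach described above, here is a minimal sketch. It is not the original project code. The class name GocrLauncher, the method names, and the executable path passed in are all hypothetical. It assumes the stock GOCR options "-i" (input image) and "-o" (output file), and it assumes the recognized text can be read as UTF-8.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: runs an external OCR executable and reads its output file.
class GocrLauncher {

    // Builds the command line. "-i" and "-o" are assumed to be the
    // input-image and output-file options of the GOCR build in use.
    static List<String> buildCommand(String executable, Path image, Path output) {
        return Arrays.asList(executable, "-i", image.toString(), "-o", output.toString());
    }

    // Launches the executable, waits for it to finish, and returns the text
    // it wrote to a temporary output file. The temp file is always deleted.
    static String recognize(String executable, Path image)
            throws IOException, InterruptedException {
        Path output = Files.createTempFile("ocr", ".txt");
        try {
            Process process = new ProcessBuilder(buildCommand(executable, image, output))
                    .inheritIO() // pass through stdout/stderr so the child cannot block on a full pipe
                    .start();
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IOException("OCR process exited with code " + exitCode);
            }
            // Assumption: the output is plain text readable as UTF-8.
            return new String(Files.readAllBytes(output), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(output);
        }
    }
}
```

A call would look something like GocrLauncher.recognize("gocr.exe", Paths.get("page.pnm")), where both the executable name and the image file are placeholders. Because this sketch uses a fresh temp file for each call, it does not depend on whether the executable appends to or overwrites its output file.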
As for packaging, you can just include the .exe and the .jar in
your zip; the user extracts everything from the zip and runs them as
separate .exe and .jar files.
I can get you more practical details about the Java/GOCR interaction
around the 11th of January, if that's ok. I will also have access to the
recompiled GOCR executable then.
Thanks.
endo

Request for Question Clarification by endo-ga on 20 Jan 2005 12:36 PST
Hi,
Are you still interested in the solution I can provide?
Thanks.
endo