Q: Text reports converter (techtor, feilong, cipher17, lot) ( No Answer,   6 Comments )
Question  
Subject: Text reports converter (techtor, feilong, cipher17, lot)
Category: Computers > Software
Asked by: octopia-ga
List Price: $10.00
Posted: 11 Jan 2003 20:33 PST
Expires: 10 Feb 2003 20:33 PST
Question ID: 141723
Hi everyone! 

I am looking for solution providers who can build and customize
software that takes a (scanned) text report and transforms it into a
data sheet or table. So far I have found this source, but it is somewhat
expensive (http://www.kingbusinesssolutions.com/businessTools.htm).

Request for Question Clarification by feilong-ga on 12 Jan 2003 02:59 PST
Hello Octopia,

If you want scanned text to be converted into a data sheet or table,
why not do it yourself and save money? You can transform scanned text
into editable text by using an OCR program. OmniPage Pro and
TextBridge are two programs that can give you the result you want. You
may want to try these first and see if they suit your needs. Corel
Graphics Suite also has an OCR program. If my suggestions help,
please do tell me. Thanks for being a loyal Google Answers user.

Feilong

Clarification of Question by octopia-ga on 12 Jan 2003 03:59 PST
Hi Feilong.

That's a good suggestion and I am already doing that with simple text.
The tool that I am looking for is not for text recognition.
It's for identifying and organizing different values in the text.

If you have a look at the website provided, I am sure you'll
understand. (Try to see the "before" and "after" results in the 
"One-Click Text Report Convertor" Demo).

Here is a simple example, though. Suppose you have a report with
information on lots of companies. The different values for each
company (for example: company name, telephone, fax, contact person)
are not listed in an organized table. Rather, you have a continuous
flow that runs over three newspaper-like columns, and from one page to
the next. In this case, the customized application that I am looking
for reads any value that comes after the word "Company:",
recognizes that whatever comes after that is the company name, and so
forth. It then places each value under the relevant column in an Excel
or text file. When done, it gives me the same content, but organized
in one table.
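
A minimal sketch of the keyword idea described above: read whatever follows a known label such as "Tel:" or "Fax:" up to the next label. It is written in Python purely for illustration (Perl is the language suggested later in the thread); the label list and the function name extract_labelled are assumptions, not part of any existing tool.

```python
import re

# Hypothetical label list; these four appear in the contact lines of the sample report.
LABELS = ["Tel", "Fax", "E-mail", "Website"]

def extract_labelled(text, labels=LABELS):
    """Return {label: value} for every 'Label: value' run found in text."""
    # Join wrapped lines so a value split across a line break is read as one.
    flat = " ".join(text.split())
    alternation = "|".join(re.escape(label) for label in labels)
    # A value runs from its label up to the next known label or the end of the text.
    pattern = re.compile(
        r"(%s):\s*(.*?)\s*(?=(?:%s):|$)" % (alternation, alternation)
    )
    return {label: value for label, value in pattern.findall(flat)}

sample = """Tel: +971 (6) 533 2793 Fax: +971 (6) 533 0099 E-mail:
abzashj@emirates.net.ae"""
fields = extract_labelled(sample)
print(fields)
```

Lines where the scan dropped the colon (for example "Tel +971(2)6273131") would not match this pattern and would still need cleaning up first.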

Hope that helps to clarify the point, and thanks a lot for your prompt
reply.

Request for Question Clarification by lot-ga on 12 Jan 2003 07:48 PST
Hello octopia-ga
I understand what you are saying, so ultimately you are seeking to
create a CSV file from the scanned input which imports into your
database.
I don't have a solution off the top of my head, but just to let you
know that I've acknowledged your request.
kind regards
lot-ga

Clarification of Question by octopia-ga on 12 Jan 2003 11:22 PST
Good idea, rac! I will post below an example of the text before and
after processing. Over the next couple of clarifications, I am going
to post text before and after processing.

Right now, I am facing problems with the contact names because they
contain (1) title, (2) person name, and (3) sometimes the email address
of that person. Unfortunately these three are not separated by key
words to identify them. But if worst comes to worst, it might be a
case of manually processing this field only. Anyway, here is the
whole thing....

Clarification of Question by octopia-ga on 12 Jan 2003 11:23 PST
Before: it looks like this...
-------------------------------------------------------------
Abba Electronics LLC
PO Box 327 Dubai United Arab Emirates
Tel: +971 (4)3371800 Fax: +971 (4) 336 4283 E-mail: abba (c)emirates,
net. ae Website: www.alabbas.com
KEY PERSONNEL
Chairman 	 Mr Ibrahim Al Abbas
General Manager 	 Mr Abdul Quadir (E-mail: abdul.quadir@abbauae.com)
Chief Accountant 	 Mr Uday Poojary (E-mail: uday.poojary@abbauae.com)
Sales Manager 	 Mr Prakash Kelkar (E-mail: prakash.kelkar@abbauae.com)
Operations Manager 	 Mr Dinesh Shetty (E-mail:
dinesh.shetty@abbauae.com)
Human Resources Manager 	 Mr John Tarian (E-mail:
john.tarian@abbauae.com)
LOCAL STATISTICS
No of employees: Under 100 Sales Volume: US$ 0-5mn Company locally
established: 1 979 company locally established: 1 994
BUSINESS ACTIVITY
Abba Electronics (UAE) provides business systems, system integration
and channel distribution. Distributors for Data Card, Pitney Bowes,
NEC and Casio. Member of Al Abbas Group (UAE).
INDUSTRY CLASSIFICATION
Consumer Goods, Electronics/Electrical, Services
NATIONALITY /TRADE AFFILIATION
United Arab Emirates, Japan, USA
DISTRIBUTOR FOR
Pitney Bowes, USA
DISTRIBUTOR FOR
NEC Corporation, Japan
DISTRIBUTOR FOR
Casio Computer Co Ltd, Japan
Abbar & Zainy Cold Stores
PO Box 6066 Sharjah United Arab Emirates
Tel: +971 (6) 533 2793 Fax: +971 (6) 533 0099 E-mail:
abzashj@emirates.net.ae
KEY PERSONNEL
Managing Director 	 Mr Camilo Venegas
General Manager 	 Mr Mazen Barakat
LOCAL STATISTICS
No of employees: Under 100
BUSINESS ACTIVITY
Abbar & Zainy Cold Stores (UAE) trades in food products and fruits.
Agent for Dovex Export Co (USA).
INDUSTRY CLASSIFICATION
Agriculture/Environmental, Food & Drink
NATIONALITY /TRADE AFFILIATION
United Arab Emirates, USA
Abbott Laboratories Regional Office
PO Box 32002                                                          
                                                                      
                   1 Dubai                                            
                                                                      
                                    4 United Arab Emirates
Tel: +971 (4)221 2711                                                 
                                                                      
                ( Fax: +971 (4)2231926                                
                                                                      
                                J Website: www.abbott.com
KEY PERSONNEL
General Manager 	 Mr Elie Abdelkarim
LOCAL STATISTICS
No of employees: Under 1 00
BUSINESS ACTIVITY
Abbott (USA) specialises in five broad areas, including pharmaceutical
products, chemical and agricultural products, nutritional products,
diagnostic products and hospital products. Abbott Laboratories employs
60,000 people globally and                         j markets its
products in more than 1 30 countries. All Abbott's regional operations
throughout Europe, Asia and the Americas are managed from the head
office.
INDUSTRY CLASSIFICATION
Pharmaceuticals/Medical
NATIONALITY /TRADE AFFILIATION
USA
SUBSIDIARY OF
Abbott Laboratories, USA
Abbott Laboratories SA
PO Box 535 18                                                         
                                                                      
                     | 403 API World Tower Sheikh Zayed Road Dubai    
                                                                      
                                                                      
 ( United Arab Emirates
Tel: +971 (4) 332 7862 Fax: +971 (4) 332 7904                         
                                                                      
                                       ( Website: www.abbott.com
KEY PERSONNEL
Country Manager 	 Mr Aslam Hawa
Finance Manager 	 Mr Syed Zakiuddin
Marketing and Sales Manager 	 Mr Kinjal Zaveri
LOCAL STATISTICS
No of employees: Under 100 Regional Head Office: Middle East Company
locally established: 1 981
BUSINESS ACTIVITY
Abbott Laboratories (United Arab Emirates) is the diagnostics division
of Abbott Laboratories (USA).
INDUSTRY CLASSIFICATION
Pharmaceuticals/Medical
NATIONALITY / TRADE AFFILIATION USA
SUBSIDIARY OF
Abbott Laboratories, USA
Abdul Jalil Crocker Partnership, The
PO Box 9259 UAE Enterprises Building Airport Road Dubai United Arab
Emirates
Tel: +971 (4) 286 8687 Fax: +971 (4) 286 8646 E-mail: ajcpdxb
(c)emirates, net. ae
KEY PERSONNEL
Chief Executive Officer 	 	 	 Mr Stuart Johnson
LOCAL STATISTICS
No of employees Under 100 Annual Sales Volume US$20-40mn Regional Head
Office Middle East Company locally established 1 976
BUSINESS ACTIVITY
Abdul Jalil Crocker (UAE) provides structural and civil engineering
consultancy.
INDUSTRY CLASSIFICATION
Construction/Engineering, Consultancy
NATIONALITY /TRADE AFFILIATION
United Arab Emirates, United Kingdom
SUBSIDIARY OF
RJ Crocker & Partners, United Kingdom
Abduli & Baker Co
PO Box 2745 Tariq Bin Zayed Road Sharjah United Arab Emirates
Tel: +971 (6)5625810 Fax: +971 (6)5616625 E-mail: ahujad@emi rates,
net.ae
KEY PERSONNEL
Managing Director 	 Mr Naresh Ahuja
Marketing/Sales Director 	 Mr Raju Thomas
Director 	 Mr Dilip Ahuja
Accountant 	 Mr Naresh Ahuja
LOCAL STATISTICS
No of employees: Under 1 00 Annual Sales Volume: US$ 0-5mn Company
locally established: 1975
BUSINESS ACTIVITY
Import, export, wholesale and retail of electrical consumer products.
Indian/UAE joint venture. Annual sales volume in the UAE is under
US$5mn.
INDUSTRY CLASSIFICATION
Trade, Electronics/Electrical, Retail
NATIONALITY /TRADE AFFILIATION
India, United Arab Emirates
ABN Amro Bank NV
Khaleed BinWhaleed Road PO Box 2567 Bur Dubai United Arab Emirates
Tel: +971 (4) 351 2200 Fax: +971 (4)351 1555 E-mail: aabdubai(r)
emirates, net.ae Website: www.abnamro-uae.com
KEY PERSONNEL
Country Representative 	 Mr Tom Zwaan
Deputy Country Manager 	 Mr Domzwaan
Financial Controller 	 Mr Suresh Sampat Kumar
Director, Global Trade and Advisory, EMEA 	 Mr Daniel Cotti
Director, Corporate Cash Management, EMEA 	 Mr Alan Verschoyle-King
Business Support Services Manager 	 Mr Narendra Gajria
IPB Manager 	 Mr Bala Krishnan
Operations Co-ordinator 	 Mr Amitabh Das
Treasurer 	 Mr Mir Waqas Ellahi
LOCAL STATISTICS
No of employees: 1 00+ Company locally established: 1973
BUSINESS ACTIVITY
ABN AMRO Bank (UAE) is a full service banking operation addressing
local, national and international business interests as well as
offering an extensive range of corporate, private banking, treasury
and retail services. Branch offices are located in Abu Dhabi and
Sharjah.
INDUSTRY CLASSIFICATION
Banking/Finance
NATIONALITY /TRADE AFFILIATION
Netherlands
REPRESENTATIVE OFFICE FOR
ABN AMRO Bank NV, Netherlands
ABN Amro Bank NV
PO Box 1971 Sharjah United Arab Emirates
Tel: +971 (6) 559 4900 Fax: +971 (6)5591009 E-mail: aabshj(r)
emirates, net.ae Website: www.abnamro-uae.com
KEY PERSONNEL
Branch Manager 	 Mr Rajeev Jain
Operations Manager 	 Mr Mahmood H Khan
LOCAL STATISTICS
No of employees: 500+
INDUSTRY CLASSIFICATION
Banking/Finance
NATIONALITY /TRADE AFFILIATION
Netherlands
SUBSIDIARY OF
ABN AMRO Bank NV, Netherlands
ABN Amro Bank NV
PO Box 2720 Abu Dhabi United Arab Emirates
Tel: +971 (2) 633 5400 Fax: +971 (2)6330182 E-mail: abnamro(c)
emirates. net.ae Website: www.abnamro-uae.com
KEY PERSONNEL
Corporate Manager 	 MrTanveer Islam
General Affairs Manager 	 Ms Rachel Duston
LOCAL STATISTICS
No of employees: Under 100 Company locally established: 1 975
INDUSTRY CLASSIFICATION
Banking/Finance
NATIONALITY /TRADE AFFILIATION
Netherlands
SUBSIDIARY OF
ABN AMRO Bank NV, Netherlands
Abu Dhabi Maritime & Mercantile International Co (ADMMI)
PO Box 247 Abu Dhabi United Arab Emirates
Tel +971(2)6273131 Fax +971 (2) 626 9661 E mail admmige n@ emirates
net ae Website www admmi com
KEY PERSONNEL
Partner                                                               
                      Mr R Al Masaood Finance Director                
                                                             Mr
Nagarajan General Manager                                             
                               Mr John Aves
LOCAL STATISTICS
No of employees Under 100 Company locally established 1 965
BUSINESS ACTIVITY
Industrial caterers, cold store operators, retailers, wholesalers of
foodstuff, travel, shipping and engineering services
INDUSTRY CLASSIFICATION
Food & Drink, Retail, Services, Transport, Tounsm/Travel/Leisure
NATIONALITY /TRADE AFFILIATION
United Kingdom, USA
ASSOCIATED WITH
Inchcape pic, United Kingdom
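
One possible way to handle the KEY PERSONNEL lines in the sample above, sketched in Python for illustration only. It rests on a loudly stated assumption: every name starts with an honorific (Mr, Mrs, Ms, Dr), which holds in this sample but may not hold elsewhere. The name parse_person is hypothetical. Lines that do not fit (such as "MrTanveer Islam", where the scan dropped the space) return None and are left for manual processing.

```python
import re

# Assumption: every person's name begins with an honorific such as Mr or Ms,
# as in the sample; real reports may need a longer list.
PERSON = re.compile(
    r"^(?P<title>.+?)\s+(?P<name>(?:Mr|Mrs|Ms|Dr)\b.*?)"
    r"(?:\s*\(E-mail:\s*(?P<email>[^)]*)\))?\s*$"
)

def parse_person(line):
    """Split one KEY PERSONNEL line into (title, name, email)."""
    # Collapse tabs, wrapped lines and repeated spaces into single spaces.
    flat = " ".join(line.split())
    match = PERSON.match(flat)
    if match is None:
        return None  # unusual line: leave it for manual processing
    return match.group("title"), match.group("name"), match.group("email") or ""

print(parse_person("General Manager \t Mr Abdul Quadir (E-mail: abdul.quadir@abbauae.com)"))
print(parse_person("Chairman \t Mr Ibrahim Al Abbas"))
```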

Clarification of Question by octopia-ga on 12 Jan 2003 11:25 PST
.... and after, here are the CSV values. You can save this text in a
Notepad file and try viewing it in Excel to see what I mean...
--------------------------------------------------------
Comapany Name	Address	City 	Country	Phone	Fax 	Email 	Website	KEY
PERSONNEL Title 1	KEY PERSONNEL Name 1	KEY PERSONNEL Email 1	KEY
PERSONNEL Title 2	KEY PERSONNEL Name 2	KEY PERSONNEL Email 2	KEY
PERSONNEL Title 3	KEY PERSONNEL Name 3	KEY PERSONNEL Email 3	KEY
PERSONNEL Title 4	KEY PERSONNEL Name 4	KEY PERSONNEL Email 4	KEY
PERSONNEL Title 5	KEY PERSONNEL Name 5	KEY PERSONNEL Email 5	KEY
PERSONNEL Title 6	KEY PERSONNEL Name 6	KEY PERSONNEL Email 6	KEY
PERSONNEL Title 7	KEY PERSONNEL Name 7	KEY PERSONNEL Email 7	KEY
PERSONNEL Title 8	KEY PERSONNEL Name 8	KEY PERSONNEL Email 8	KEY
PERSONNEL Title 9	KEY PERSONNEL Name 9	KEY PERSONNEL Email 9	KEY
PERSONNEL Title 10	KEY PERSONNEL Name 10	KEY PERSONNEL Email 10	No of
employees	Sales Volume	Regional Head Office	Company locally 	BUSINESS
ACTIVITY	INDUSTRY CLASSIFICATION	NATIONALITY /TRADE
AFFILIATION	Relationship 1	Related Company  1	Relationship 2	Related
Company  2	Relationship 3	Related Company  3	Relationship 4	Related
Company  4	Relationship 5	Related Company  5
Abba Electronics LLC	PO Box 327  	Dubai	United Arab Emirates	971
(4)3371800 	971 (4) 336 4283 	"abba ©emirates, net.
ae"	www.alabbas.com	Chairman	Mr Ibrahim Al Abbas		General Manager	Mr
Abdul Quadir	abdul.quadir@abbauae.com	Chief Accountant 	Mr Uday
Poojary	uday.poojary@abbauae.com	Sales Manager	Mr Prakash Kelkar
	prakash.kelkar@abbauae.com	Operations Manager	Mr Dinesh Shetty
	dinesh.shetty@abbauae.com	Human Resources Manager 	Mr John Tarian
	john.tarian@abbauae.com													Under 100 	US$ 0-5mn 		 1
994	"Abba Electronics (UAE) provides business systems, system
integration and channel distribution. Distributors for Data Card,
Pitney Bowes, NEC and Casio. Member of Al Abbas Group
(UAE)."	"Consumer Goods, Electronics/Electrical, Services"	"United
Arab Emirates, Japan, USA"	DISTRIBUTOR FOR	"Pitney Bowes,
USA"	DISTRIBUTOR FOR	"NEC Corporation, Japan"	DISTRIBUTOR FOR	"Casio
Computer Co Ltd, Japan"
Abbar & Zainy Cold Stores	PO Box 6066 Sharjah United Arab
Emirates			971 (6) 533 2793	971 (6) 533 0099
	abzashj@emirates.net.ae		Managing Director  	Mr Camilo
Venegas		General Manager  	Mr Mazen
Barakat																										Under 100 				Abbar & Zainy Cold
Stores (UAE) trades in food products and fruits. Agent for Dovex
Export Co (USA).	"Agriculture/Environmental, Food & Drink"	"United
Arab Emirates, USA"
Abbott Laboratories Regional Office	PO Box 32002                      
                                                                      
                                                       1 Dubai        
                                                                      
                                                                      
 4 United Arab Emirates			971 (4)221 2711	971 (4)2231926              
                                                                      
                                                 
		www.abbott.com	General Manager 	Mr Elie
Abdelkarim																													Under 100 				"Abbott (USA)
specialises in five broad areas, including pharmaceutical products,
chemical and agricultural products, nutritional products, diagnostic
products and hospital products. Abbott Laboratories employs 60,000
people globally and                         j markets its products in
more than 1 30 countries. All Abbott's regional operations throughout
Europe, Asia and the Americas are managed from the head
office."	Pharmaceuticals/Medical                                      
                                                               	USA   
                                                                      
                                                                      
       	SUBSIDIARY OF	"Abbott Laboratories, USA                       
                                                                      
            "
Abbott Laboratories SA	PO Box 535 18                                  
                                                                      
                                            | 403 API World Tower
Sheikh Zayed Road Dubai                                               
                                                                      
                             ( United Arab Emirates			971 (4) 332
7862	971 (4) 332 7904                                                 
                                                                      
               		www.abbott.com	Country Manager  	Mr Aslam
Hawa		Finance Manager  	Mr Syed Zakiuddin		Marketing and Sales Manager
 	Mr Kinjal Zaveri																							Under 100 		Middle East 	1
981	Abbott Laboratories (United Arab Emirates) is the diagnostics
division of Abbott Laboratories
(USA).	Pharmaceuticals/Medical		SUBSIDIARY OF	"Abbott Laboratories,
USA"
"Abdul Jalil Crocker Partnership, The"	PO Box 9259 UAE Enterprises
Building Airport Road 	Dubai 	United Arab Emirates	971 (4) 286
8687	971 (4) 286 8646 	"ajcpdxb ©emirates, net. ae"		Chief Executive
Officer  	Mr Stuart Johnson																													Under 100 			1
970	Abdul Jalil Crocker (UAE) provides structural and civil
engineering consultancy.	"Construction/Engineering,
Consultancy"	"United Arab Emirates, United Kingdom"	SUBSIDIARY OF	"RJ
Crocker & Partners, United Kingdom"
Abduli & Baker Co	PO Box 2745 Tariq Bin Zayed Road 	Sharjah 	United
Arab Emirates	971 (6)5625810	971 (6)5616625 	"ahujad@emi rates,
net.ae"		Managing Director  	Mr Naresh Ahuja		Marketing/Sales Director
	Mr Raju Thomas		Director  	Mr Dilip Ahuja		Accountant  	Mr Naresh
Ahuja																				Under 100 	US$20-40mn	Middle East 	1 976
	"Import, export, wholesale and retail of electrical consumer
products. Indian/UAE joint venture. Annual sales volume in the UAE is
under US$5mn."	"Trade, Electronics/Electrical, Retail"	"India, United
Arab Emirates"
ABN Amro Bank NV	Khaleed BinWhaleed Road PO Box 2567	Bur Dubai 	United
Arab Emirates	971 (4) 351 2200 	971 (4)351 1555 	"aabdubai® emirates,
net.ae "	www.abnamro-uae.com	Country Representative  	Mr Tom
Zwaan		Deputy Country Manager  	Mr Domzwaan		Financial Controller  	Mr
Suresh Sampat Kumar		"Director, Global Trade and Advisory, EMEA  "	Mr
Daniel Cotti		"Director, Corporate Cash Management, EMEA  "	Mr Alan
Verschoyle-King		Business Support Services Manager  	Mr Narendra
Gajria		IPB Manager   	Mr Bala Krishnan		Operations Co-ordinator   	Mr
Amitabh Da		Treasurer   	Mr Mir Waqas Ellahi					1 00+ 			1973	"ABN
AMRO Bank (UAE) is a full service banking operation addressing local,
national and international business interests as well as offering an
extensive range of corporate, private banking, treasury and retail
services. Branch offices are located in Abu Dhabi and
Sharjah."	Banking/Finance	Netherlands	REPRESENTATIVE OFFICE FOR	"ABN
AMRO Bank NV, Netherlands"
ABN Amro Bank NV	PO Box 1971 	Sharjah	United Arab Emirates	971 (6) 559
4900	971 (6)5591009 	"aabshj® emirates, net.ae
"	www.abnamro-uae.com	Branch Manager  	Mr Rajeev Jain		Operations
Manager  	Mr Mahmood H Khan																										Under 100
					Banking/Finance	Netherlands	SUBSIDIARY OF	"ABN AMRO Bank NV,
Netherlands"
ABN Amro Bank NV	PO Box 2720 	Abu Dhabi	United Arab Emirates	971 (2)
633 5400 	971 (2)6330182 	abnamro© emirates. net.ae
	www.abnamro-uae.com	Corporate Manager 	MrTanveer Islam		General
Affairs Manager 	Ms Rachel Duston																										Under 100
			1 975		Banking/Finance	Netherlands	SUBSIDIARY OF	"ABN AMRO Bank NV,
Netherlands"
Abu Dhabi Maritime & Mercantile International Co (ADMMI)	PO Box 247
	Abu Dhabi	United Arab Emirates	971(2)6273131 	971 (2) 626
9661	admmige n@ emirates net ae 	www admmi com	Partner	Mr R Al
Masaood		Finance Director	Mr Nagarajan		General Manager               
                                                             	Mr John
Aves																							Under 100 			1 965	"Industrial caterers,
cold store operators, retailers, wholesalers of foodstuff, travel,
shipping and engineering services"	"Food & Drink, Retail, Services,
Transport, Tounsm/Travel/Leisure"	"United Kingdom, USA"	ASSOCIATED
WITH	"Inchcape pic, United Kingdom"

Request for Question Clarification by techtor-ga on 12 Jan 2003 21:38 PST
Good to see you again, Octopia. How much does the King Business
Solutions program cost? I was unable to get that from their website. I
have been doing my own search for text converter software, but nothing
free has turned up yet. I've found another text converter that's $160 to
purchase. I also remember the $350 file converter software I suggested
to you in a previous question. That must be quite expensive, though.

Request for Question Clarification by studboy-ga on 12 Jan 2003 23:17 PST
Have you tried using a Perl script?

Clarification of Question by octopia-ga on 13 Jan 2003 02:32 PST
Hi there, techtor, good to see you again.

Well, King's is around $999! So, I am trying to find something better
than that. It was interesting to know that the $350 one could do the job.
I will go back and check it out. As for the other software you
mentioned, that's OK. It's just that I need confirmation that it
can intelligently recognize the different values in each
record on the report pages. This task usually requires some
customization of the software, either at the customer's or the
developer's end.

Thanks…

Clarification of Question by octopia-ga on 13 Jan 2003 02:38 PST
Hi studboy,

Thanks for your clarification request. 

Nope, I didn't try a Perl script. Please let me know if it would help me,
and where I can find reasonably priced development services for that.

Clarification of Question by octopia-ga on 13 Jan 2003 02:43 PST
Thanks for updating me, lot. If you come across something that I can
use, please let me know.

Request for Question Clarification by techtor-ga on 13 Jan 2003 09:24 PST
At $999 you could get a good new computer. Wow. 
Unfortunately, the $350 program I mentioned earlier recently had a
price increase to $470, although I believe the makers released a new
version. Just to refresh your memory:

Soft Interface - Convert-Doc
http://www.softinterface.com/Convert-File-Programs/Convert-File-Program.HTM

I do assume you want a program that preserves tabular organization in
the file conversion, as in the example of King Business Solutions'
program.

I'll keep looking around.

Request for Question Clarification by studboy-ga on 13 Jan 2003 13:55 PST
Hi octopia-ga

Well, Perl is free. Development-wise, you are looking at
simple regular expressions, which you can probably code yourself
in about 10 minutes or less.

Download Indigo Perl from here and install it.  I will walk
you through the rest.

http://www.indigostar.com/indigoperl.htm

Clarification of Question by octopia-ga on 13 Jan 2003 22:50 PST
Thanks, techtor, I went back and checked it out, but the requirements
now are somewhat different. The key is not handling the file as a
whole and converting it to another format. It's reading the text
inside the file, processing it according to key words, and then
placing each value in the corresponding field. The solution should
allow you to define which key words in the source to look for,
and what to do with the text that comes after them.

The problem here is that the report above doesn't come in an organized
table-like format; it's just a continuous flow of data. To be of any
use, it needs to be turned back into a datasheet, with the relevant value
from each listed record placed in the appropriate cell and
corresponding field.
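
As a rough illustration of defining key words and deciding what to do with the text after them, the sketch below groups the lines of one record under the section heading that precedes them. It is Python, used here only as an example language, and the heading list and the name split_sections are assumptions taken from the sample report rather than from any real product.

```python
# Headings seen in the sample report; treat the list as an assumption
# and extend it for other reports.
HEADINGS = [
    "KEY PERSONNEL",
    "LOCAL STATISTICS",
    "BUSINESS ACTIVITY",
    "INDUSTRY CLASSIFICATION",
    "NATIONALITY /TRADE AFFILIATION",
]

def split_sections(record_lines, headings=HEADINGS):
    """Group the lines of one record under the heading that precedes them."""
    sections = {"HEADER": []}  # name/address/phone lines before the first heading
    current = "HEADER"
    for line in record_lines:
        stripped = line.strip()
        if stripped in headings:
            current = stripped
            sections.setdefault(current, [])
        elif stripped:
            sections[current].append(stripped)
    return sections

record = """Abbar & Zainy Cold Stores
PO Box 6066 Sharjah United Arab Emirates
KEY PERSONNEL
Managing Director Mr Camilo Venegas
General Manager Mr Mazen Barakat
LOCAL STATISTICS
No of employees: Under 100
INDUSTRY CLASSIFICATION
Agriculture/Environmental, Food & Drink""".splitlines()

sections = split_sections(record)
print(sections["LOCAL STATISTICS"])
```

Headings are matched exactly, so a scanned variant such as "NATIONALITY / TRADE AFFILIATION USA" would fall through as ordinary text unless the list or the matching rule is loosened.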

It’s a tough one but if you have any ideas, please let me know.

Clarification of Question by octopia-ga on 13 Jan 2003 22:54 PST
Thank you very much, studboy, for offering to help. I think this might
be the best solution if it works out. I am excited about it and am now
downloading the Perl application. I will update you once that is done.

Clarification of Question by octopia-ga on 14 Jan 2003 05:53 PST
Hi studboy, 

I have just completed installing Perl, and am ready for your suggestions.

Thanks a lot for your help...

Request for Question Clarification by studboy-ga on 14 Jan 2003 11:28 PST
OK, the first step is identifying the separators for your input and output docs.

In the text file you posted here, the separators are lost, because
Google Answers treats the posting as plain text. I need two things:

1) For your input file, I need some help with the company name part.
Is it possible for you to precede each company name with something
like

Company:

?

Simply put, when you try to process the file, the program needs a way to separate
the records. Having an identifier helps.

2) For your output files, I assume the separator is a TAB, right?
It cannot be spaces because some of the fields (like addresses) have spaces
in them.

As soon as you help me with these two questions, I will help you move on.
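
To show why a record identifier helps, here is a small sketch that starts a new record at every line beginning with a marker. It is in Python for illustration (the thread itself proposes Perl), and the function name split_records and the default marker "Company:" are assumptions based on the suggestion above.

```python
def split_records(text, marker="Company:"):
    """Return one list of lines per record; a record starts at each marker line."""
    records = []
    for line in text.splitlines():
        if line.startswith(marker):
            # Start a new record and keep the company name without the marker.
            records.append([line[len(marker):].strip()])
        elif records and line.strip():
            # Any other non-blank line belongs to the most recent record.
            records[-1].append(line.strip())
    return records

sample = """Company: Abba Electronics LLC
PO Box 327 Dubai United Arab Emirates
Company: Abbar & Zainy Cold Stores
PO Box 6066 Sharjah United Arab Emirates
"""
records = split_records(sample)
print(len(records), records[0][0])
```

A company line that is missing the marker is simply appended to the previous record, so it is worth checking the record count against the number of companies in the report.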

Clarification of Question by octopia-ga on 14 Jan 2003 21:33 PST
Hi Studboy. That sounds like a deal. I have corrected the sample by
adding "Company: " before each company name. Here are the results....
---------------------------------------------------------------------------
Company: Abba Electronics LLC
PO Box 327 Dubai United Arab Emirates
Tel: +971 (4)3371800 Fax: +971 (4) 336 4283 E-mail: abba ©emirates,
net. ae Website: www.alabbas.com
KEY PERSONNEL
Chairman 	 Mr Ibrahim Al Abbas
General Manager 	 Mr Abdul Quadir (E-mail: abdul.quadir@abbauae.com)
Chief Accountant 	 Mr Uday Poojary (E-mail: uday.poojary@abbauae.com)
Sales Manager 	 Mr Prakash Kelkar (E-mail: prakash.kelkar@abbauae.com)
Operations Manager 	 Mr Dinesh Shetty (E-mail:
dinesh.shetty@abbauae.com)
Human Resources Manager 	 Mr John Tarian (E-mail:
john.tarian@abbauae.com)
LOCAL STATISTICS
No of employees: Under 100 Sales Volume: US$ 0-5mn Company locally
established: 1 979 company locally established: 1 994
BUSINESS ACTIVITY
Abba Electronics (UAE) provides business systems, system integration
and channel distribution. Distributors for Data Card, Pitney Bowes,
NEC and Casio. Member of Al Abbas Group (UAE).
INDUSTRY CLASSIFICATION
Consumer Goods, Electronics/Electrical, Services
NATIONALITY /TRADE AFFILIATION
United Arab Emirates, Japan, USA
DISTRIBUTOR FOR
Pitney Bowes, USA
DISTRIBUTOR FOR
NEC Corporation, Japan
DISTRIBUTOR FOR
Casio Computer Co Ltd, Japan
Company: Abbar & Zainy Cold Stores
PO Box 6066 Sharjah United Arab Emirates
Tel: +971 (6) 533 2793 Fax: +971 (6) 533 0099 E-mail:
abzashj@emirates.net.ae
KEY PERSONNEL
Managing Director 	 Mr Camilo Venegas
General Manager 	 Mr Mazen Barakat
LOCAL STATISTICS
No of employees: Under 100
BUSINESS ACTIVITY
Abbar & Zainy Cold Stores (UAE) trades in food products and fruits.
Agent for Dovex Export Co (USA).
INDUSTRY CLASSIFICATION
Agriculture/Environmental, Food & Drink
NATIONALITY /TRADE AFFILIATION
United Arab Emirates, USA
Company: Abbott Laboratories Regional Office
PO Box 32002                                                          
                                                                      
                   1 Dubai                                            
                                                                      
                                    4 United Arab Emirates
Tel: +971 (4)221 2711                                                 
                                                                      
                ( Fax: +971 (4)2231926                                
                                                                      
                                J Website: www.abbott.com
KEY PERSONNEL
General Manager 	 Mr Elie Abdelkarim
LOCAL STATISTICS
No of employees: Under 1 00
BUSINESS ACTIVITY
Abbott (USA) specialises in five broad areas, including pharmaceutical
products, chemical and agricultural products, nutritional products,
diagnostic products and hospital products. Abbott Laboratories employs
60,000 people globally and                         j markets its
products in more than 1 30 countries. All Abbott's regional operations
throughout Europe, Asia and the Americas are managed from the head
office.
INDUSTRY CLASSIFICATION
Pharmaceuticals/Medical
NATIONALITY /TRADE AFFILIATION
USA
SUBSIDIARY OF
Abbott Laboratories, USA
Company: Abbott Laboratories SA
PO Box 535 18                                                         
                                                                      
                     | 403 API World Tower Sheikh Zayed Road Dubai    
                                                                      
                                                                      
 ( United Arab Emirates
Tel: +971 (4) 332 7862 Fax: +971 (4) 332 7904                         
                                                                      
                                       ( Website: www.abbott.com
KEY PERSONNEL
Country Manager 	 Mr Aslam Hawa
Finance Manager 	 Mr Syed Zakiuddin
Marketing and Sales Manager 	 Mr Kinjal Zaveri
LOCAL STATISTICS
No of employees: Under 100 Regional Head Office: Middle East Company
locally established: 1 981
BUSINESS ACTIVITY
Abbott Laboratories (United Arab Emirates) is the diagnostics division
of Abbott Laboratories (USA).
INDUSTRY CLASSIFICATION
Pharmaceuticals/Medical
NATIONALITY / TRADE AFFILIATION USA
SUBSIDIARY OF
Abbott Laboratories, USA
Company: Abdul Jalil Crocker Partnership, The
PO Box 9259 UAE Enterprises Building Airport Road Dubai United Arab
Emirates
Tel: +971 (4) 286 8687 Fax: +971 (4) 286 8646 E-mail: ajcpdxb
©emirates, net. ae
KEY PERSONNEL
Chief Executive Officer 	 	 	 Mr Stuart Johnson
LOCAL STATISTICS
No of employees Under 100 Annual Sales Volume US$20-40mn Regional Head
Office Middle East Company locally established 1 976
BUSINESS ACTIVITY
Abdul Jalil Crocker (UAE) provides structural and civil engineering
consultancy.
INDUSTRY CLASSIFICATION
Construction/Engineering, Consultancy
NATIONALITY /TRADE AFFILIATION
United Arab Emirates, United Kingdom
SUBSIDIARY OF
RJ Crocker & Partners, United Kingdom
Company: Abduli & Baker Co
PO Box 2745 Tariq Bin Zayed Road Sharjah United Arab Emirates
Tel: +971 (6)5625810 Fax: +971 (6)5616625 E-mail: ahujad@emi rates,
net.ae
KEY PERSONNEL
Managing Director 	 Mr Naresh Ahuja
Marketing/Sales Director 	 Mr Raju Thomas
Director 	 Mr Dilip Ahuja
Accountant 	 Mr Naresh Ahuja
LOCAL STATISTICS
No of employees: Under 1 00 Annual Sales Volume: US$ 0-5mn Company
locally established: 1975
BUSINESS ACTIVITY
Import, export, wholesale and retail of electrical consumer products.
Indian/UAE joint venture. Annual sales volume in the UAE is under
US$5mn.
INDUSTRY CLASSIFICATION
Trade, Electronics/Electrical, Retail
NATIONALITY /TRADE AFFILIATION
India, United Arab Emirates
Company: ABN Amro Bank NV
Khaleed BinWhaleed Road PO Box 2567 Bur Dubai United Arab Emirates
Tel: +971 (4) 351 2200 Fax: +971 (4)351 1555 E-mail: aabdubai®
emirates, net.ae Website: www.abnamro-uae.com
KEY PERSONNEL
Country Representative 	 Mr Tom Zwaan
Deputy Country Manager 	 Mr Domzwaan
Financial Controller 	 Mr Suresh Sampat Kumar
Director, Global Trade and Advisory, EMEA 	 Mr Daniel Cotti
Director, Corporate Cash Management, EMEA 	 Mr Alan Verschoyle-King
Business Support Services Manager 	 Mr Narendra Gajria
IPB Manager 	 Mr Bala Krishnan
Operations Co-ordinator 	 Mr Amitabh Das
Treasurer 	 Mr Mir Waqas Ellahi
LOCAL STATISTICS
No of employees: 1 00+ Company locally established: 1973
BUSINESS ACTIVITY
ABN AMRO Bank (UAE) is a full service banking operation addressing
local, national and international business interests as well as
offering an extensive range of corporate, private banking, treasury
and retail services. Branch offices are located in Abu Dhabi and
Sharjah.
INDUSTRY CLASSIFICATION
Banking/Finance
NATIONALITY /TRADE AFFILIATION
Netherlands
REPRESENTATIVE OFFICE FOR
ABN AMRO Bank NV, Netherlands
Company: ABN Amro Bank NV
PO Box 1971 Sharjah United Arab Emirates
Tel: +971 (6) 559 4900 Fax: +971 (6)5591009 E-mail: aabshj® emirates,
net.ae Website: www.abnamro-uae.com
KEY PERSONNEL
Branch Manager 	 Mr Rajeev Jain
Operations Manager 	 Mr Mahmood H Khan
LOCAL STATISTICS
No of employees: 500+
INDUSTRY CLASSIFICATION
Banking/Finance
NATIONALITY /TRADE AFFILIATION
Netherlands
SUBSIDIARY OF
ABN AMRO Bank NV, Netherlands
Company: ABN Amro Bank NV
PO Box 2720 Abu Dhabi United Arab Emirates
Tel: +971 (2) 633 5400 Fax: +971 (2)6330182 E-mail: abnamro© emirates.
net.ae Website: www.abnamro-uae.com
KEY PERSONNEL
Corporate Manager 	 MrTanveer Islam
General Affairs Manager 	 Ms Rachel Duston
LOCAL STATISTICS
No of employees: Under 100 Company locally established: 1 975
INDUSTRY CLASSIFICATION
Banking/Finance
NATIONALITY /TRADE AFFILIATION
Netherlands
SUBSIDIARY OF
ABN AMRO Bank NV, Netherlands
Company: Abu Dhabi Maritime & Mercantile International Co (ADMMI)
PO Box 247 Abu Dhabi United Arab Emirates
Tel +971(2)6273131 Fax +971 (2) 626 9661 E mail admmige n@ emirates
net ae Website www admmi com
KEY PERSONNEL
Partner                                                               
                      Mr R Al Masaood Finance Director                
                                                             Mr
Nagarajan General Manager                                             
                               Mr John Aves
LOCAL STATISTICS
No of employees Under 100 Company locally established 1 965
BUSINESS ACTIVITY
Industrial caterers, cold store operators, retailers, wholesalers of
foodstuff, travel, shipping and engineering services
INDUSTRY CLASSIFICATION
Food & Drink, Retail, Services, Transport, Tounsm/Travel/Leisure
NATIONALITY /TRADE AFFILIATION
United Kingdom, USA
ASSOCIATED WITH
Inchcape pic, United Kingdom

Clarification of Question by octopia-ga on 14 Jan 2003 21:40 PST
Hi Studboy, 

On the output files, you are right: the separator cannot be spaces.
So tab-separated values are no problem.

Request for Question Clarification by studboy-ga on 15 Jan 2003 03:30 PST
Hi Octopia

The input file has too many discrepancies--
for example, in the last record, "E-mail" appears as "E mail", and
sometimes the emails are typed wrong--like abba @emirates, net .ae

I think Google Answers might have garbled up part of your file as
well--is it possible you can tell me where the original file comes
from and/or upload it to your website so I can get a cleaner look? 
Thanks.

Clarification of Question by octopia-ga on 15 Jan 2003 22:28 PST
hi studboy, 

Yes, I agree, this is sometimes annoying. Google Answers didn't change
the data; that's the way it came from the source. The data goes
through a "cleaning-up" stage after it is processed. That's no
problem, though: I am used to fixing that with VB text formulas.

I am not sure whether this is going to affect the Perl processing. The
key words, though, I think are all OK. Please let me know what you
think...

Request for Question Clarification by studboy-ga on 16 Jan 2003 00:49 PST
First of all, I think you're missing a Company: in front of--


Abbott Laboratories Regional Office 
PO Box 32002

                   1 Dubai

                                    4 United Arab Emirates
Tel: +971 (4)221 2711

                ( Fax: +971 (4)2231926

                                J Website: www.abbott.com

It would be nice too if you could have Address: in front of the
addresses, etc.  But no matter.  Add the Company: to the above entry
in your input file (say you call it input).
Save the script I post next in a file called process.pl.
Then do:

perl process.pl input > output

Use an editor or Excel to look at output.  I only process the first
few columns to give you the idea: you can extend it to the rest of the
entries.

Let me know how it goes.

Request for Question Clarification by studboy-ga on 16 Jan 2003 00:52 PST
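#!/usr/local/bin/perl
use strict;
use warnings;

my $usage = "Usage: process.pl inputfile > outputfile\n";

# get args: exactly one input file name; -h prints the usage

if ( ( @ARGV != 1 ) || ( $ARGV[0] =~ /^-h/i ) ) {
    print $usage;
    exit 1;
}
my $infile = shift;

open(my $fh, '<', $infile) || die "Cannot open input file \"$infile\": $!\n";

# slurp mode: <$fh> returns the whole file as one string
$/ = undef;

# read everything from file and split it into one chunk per record;
# the lookahead keeps "Company:" at the start of each chunk
my @chunks = split(/^(?=Company:)/im, <$fh>);

close($fh);

# print the header (one name per column printed below)
print "Company\tAddress\tPhone\tFax\tEmail\n";

foreach my $chunk (@chunks) {
   # skip any text that comes before the first "Company:" marker
   next unless $chunk =~ /^Company:/i;

   my ($company, $address, $phone, $fax, $email);
   my (@chunky, $rest, $temp);

   $rest = $chunk;
   $rest =~ s/Company://gi;

   # extract the company name (the first line of the record)
   @chunky = split(/\n+/, $rest);
   $company = shift(@chunky);
   $rest = join(' ', @chunky);

   # extract the address phone fax: each split cuts at the next key word,
   # so the piece in front of it is the value of the previous field

   @chunky = split(/Tel[: ]/, $rest);
   $address = shift(@chunky);
   $rest = join(' ', @chunky);

   @chunky = split(/Fax[: ]/, $rest);
   $phone = shift(@chunky);
   $rest = join(' ', @chunky);

   $temp = $rest;

   @chunky = split(/E[- ]mail|Website/, $rest);
   $fax = shift(@chunky);
   $rest = join(' ', @chunky);

   # extract the email
   $email = "";
   if ($temp =~ /E[- ]mail[: ](.*)Website.*KEY/) {
      $email = $1;
   } elsif ($temp =~ /E[- ]mail[: ](.*)KEY/) {
      $email = $1;
   }

   # blank out missing fields and trim stray spaces and tabs
   foreach my $field ($company, $address, $phone, $fax, $email) {
      $field = "" unless defined $field;
      $field =~ s/^\s+|\s+$//g;
   }

   print "$company\t$address\t$phone\t$fax\t$email\n";

}

exit 0;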

Request for Question Clarification by studboy-ga on 17 Jan 2003 10:57 PST
Hi Octopia

Any luck with the script?

Thanks

Clarification of Question by octopia-ga on 17 Jan 2003 21:51 PST
Hi studboy, 

Sorry for not responding earlier. A few things got in the way.
Thank you for posting the code, I am getting more excited about
getting this to work.

I made the necessary modifications to the data source file, and named
it “source.txt”. I tried to navigate the IndigoPerl Console to see
where to start the next steps but had no luck. So, I have just a few
questions (sorry, I know these could sound very basic):
-	Right now I am on an IndigoPerl Console, which opens in an
Internet Explorer window. How do I create the new file
“process.pl”?
-	After creating the process.pl file, how do I enter the script you
mentioned above in this file?
-	Where can I enter the command “perl process.pl input > output”? In
other words, where is the UI that I can place the command in?
-	The output file, where does it get stored?
Thanks…..

Request for Question Clarification by studboy-ga on 18 Jan 2003 01:52 PST
Hi Octopia

1) Using NotePad, create a file called process.pl (make sure it's named
process.pl, not process.pl.txt).  Well, actually, you can name it anything
you want, as long as you use the same name in step 3.
Cut and paste my code into it.

2) Go to Start -> Run and type cmd

That will bring up the DOS command window.

cd to where your files are.

3) Type 


perl process.pl source.txt > out

The out file will be created in the same directory.

Open it with NotePad and/or Excel.

Clarification of Question by octopia-ga on 19 Jan 2003 00:11 PST
Hi Studboy,

Let me tell you, this is just great! It will save a lot of processing
and I love it.

Yes, this would be the best answer to my question. For the answer,
would you please elaborate on what I need to change in the script
to add more key words to extract other fields?

One thing I noticed: if the script does not find a key word in a
record, it does not extract the right data. Some records do
not have, say, a fax number (or the key word "Fax:"). Is there a way to
change the script so that if it does not find a key word, it leaves
the corresponding cell empty and carries on to the next key word?

Another question is about the Key Personnel. A company record
can have 4-5 Key Personnel names that are not individually preceded by
the words "Key Personnel". However, each comes on its own line (with
line breaks). So, is there a way to detect a new value for each line?
There is another complication: the number of Key Personnel changes
from one company to another. Please let me know if this can be solved
with Perl.

Looking forward to your reply, and thanks for your efforts.

Request for Question Clarification by studboy-ga on 20 Jan 2003 23:47 PST
Hi Octopia

Regarding your first question--
basically there are two ways to extract the fields--
using the split and shift method

   @chunky = split(/\n+/, $rest); 
   $company = shift(@chunky); 

or
using the match and extract method

   if ($temp =~ /E[- ]mail[: ](.*)Website.*KEY/) { 
      $email = $1; 
   } 

If Fax is not a certainty (in the samples you gave me Fax is a certainty,
but suppose it is not), then the match and extract method is
recommended because if there's no match, the field will simply be blank.
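
To make the match and extract idea concrete, here is a minimal sketch. It is
not part of the script above: the helper name extract_field, the list of
stop words and the example.com address are all assumptions for illustration,
based on the key words in the sample data. Each field is pulled out by its
own pattern, so a record with no "Fax:" just gives an empty string and the
other fields are not shifted.

```perl
use strict;
use warnings;

# Key words that can end a field's value (assumed from the sample data).
my $stop = qr/Tel[: ]|Fax[: ]|E[- ]mail[: ]|Website[: ]|KEY PERSONNEL|\z/;

# Hypothetical helper: return the text that follows the key word matched by
# $label, up to the next key word, or "" when the key word is not present.
sub extract_field {
    my ($record, $label) = @_;
    if ($record =~ /${label}[: ]\s*(.*?)\s*(?=$stop)/s) {
        return $1;
    }
    return "";    # no match, so the cell stays blank
}

# A record without a fax number, already joined onto one line.
my $record = 'PO Box 1971 Sharjah Tel: +971 (6) 559 4900 '
           . 'E-mail: someone@example.com KEY PERSONNEL Branch Manager';

my $phone = extract_field($record, qr/Tel/);
my $fax   = extract_field($record, qr/Fax/);
my $email = extract_field($record, qr/E[- ]mail/);

# The Fax cell comes out empty; Phone and Email are still in their columns.
print join("\t", $phone, $fax, $email), "\n";
```

The address has no key word in front of it, so it would still need the
split approach (everything before "Tel").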

As to your second question, line breaks are recognized as "\n",
so you can match on that to split the PERSONNEL entries.
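
A rough sketch of that idea follows. The helper name personnel_entries is
hypothetical, and the sketch assumes that KEY PERSONNEL sits on its own line
and that the next section heading is a line made only of capitals, spaces
and slashes (as in the sample data). It has to run on the record while the
line breaks are still there, that is, before the lines are joined with
spaces.

```perl
use strict;
use warnings;

# A section heading: a line made only of capitals, spaces and slashes,
# such as "LOCAL STATISTICS" (an assumption based on the sample data).
my $heading = qr/^[A-Z][A-Z \/]+(?:\n|\z)/m;

# Hypothetical helper: return one string per line of the KEY PERSONNEL
# block, or an empty list when the record has no such block.
sub personnel_entries {
    my ($record) = @_;
    return () unless $record =~ /^KEY PERSONNEL[ \t]*\n(.*?)(?=$heading|\z)/ms;
    my @entries;
    foreach my $line (split /\n/, $1) {
        $line =~ s/\s+/ /g;         # collapse the tab between title and name
        $line =~ s/^ | $//g;        # trim leading and trailing space
        push @entries, $line if length $line;
    }
    return @entries;
}

my $record = "Company: Abduli & Baker Co\n"
           . "KEY PERSONNEL\n"
           . "Managing Director \t Mr Naresh Ahuja\n"
           . "Director \t Mr Dilip Ahuja\n"
           . "LOCAL STATISTICS\n"
           . "No of employees: Under 100\n";

# One entry per line, however many lines the block has.
print "$_\n" foreach personnel_entries($record);
```

Because the result is a list, it does not matter whether a company has two
contacts or nine.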

I can close this question with some references to Perl Programming
if you like.  

Thanks and glad I can help.

Clarification of Question by octopia-ga on 24 Jan 2003 03:12 PST
Hi studboy, 

Thanks for the information. I am sure that perl has the right solution
in this situation.

I understood the point about the two separation methods, and I agree,
the second method will do the trick. So, what you are saying is that
for fields that may not exist in every record, we should use Match
& Extract.

I am really interested in seeing how the line breaks will be
recognized in the final output for the PERSONNEL fields. I will be
looking forward to seeing how the final code will work based on the
sample data in the answer. By the way, for the personnel entries, it
would be enough if perl can get each contact entry (i.e., contact
title, name, contact email) in one cell, as I can apply some quick
processing to separate those later on. I am saying that because I
realized that these specific values do not have a preceding key word
(e.g., “contact name:” or “contact title:”).
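
One possible way to get each contact entry into its own cell is sketched
below, assuming the entries for each company have already been collected
into a list. The helper name personnel_row and the "Contact N" column names
are hypothetical. Since the number of contacts varies, shorter rows are
padded with empty cells so every row has the same number of columns, and
any tab inside an entry is collapsed to a space so the entry stays in one
cell.

```perl
use strict;
use warnings;

# Hypothetical helper: build one tab-separated row holding the company name
# followed by one cell per contact, padded to $max_people contact cells.
sub personnel_row {
    my ($company, $entries, $max_people) = @_;
    my @cells;
    foreach my $entry (@$entries) {
        (my $cell = $entry) =~ s/\s+/ /g;   # a tab inside a cell would split it
        push @cells, $cell;
    }
    push @cells, ("") x ($max_people - @cells);   # pad the short rows
    return join("\t", $company, @cells);
}

# Contacts per company, taken from the sample data.
my @records = (
    [ "Abduli & Baker Co", [ "Managing Director \t Mr Naresh Ahuja",
                             "Director \t Mr Dilip Ahuja" ] ],
    [ "ABN Amro Bank NV",  [ "Branch Manager \t Mr Rajeev Jain" ] ],
);

# The widest record decides how many contact columns the table needs.
my $max_people = 0;
foreach my $rec (@records) {
    my $count = @{ $rec->[1] };
    $max_people = $count if $count > $max_people;
}

print join("\t", "Company", map("Contact $_", 1 .. $max_people)), "\n";
print personnel_row($_->[0], $_->[1], $max_people), "\n" foreach @records;
```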

Thanks a lot for offering resources on perl programming. I am sure
they will be very useful.
Answer  
There is no answer at this time.

Comments  
Subject: Re: Text reports converter (techtor, feilong, cipher17, lot)
From: miacid-ga on 12 Jan 2003 09:35 PST
 
I use TextPipePro to do text manipulation.
http://www.crystalsoftware.com.au/
I had to work hard to learn about what are called "Regular
Expressions", but once I gained some facility, and with some trial and
error, I found the program to have amazing power. I got help from the
book "Mastering Regular Expressions" by Jeffrey E. F. Friedl, published
by O'Reilly
http://www.oreilly.com/catalog/regex/
I would appreciate your comment on whether this helps or not.
Subject: Re: Text reports converter (techtor, feilong, cipher17, lot)
From: rac-ga on 12 Jan 2003 10:52 PST
 
Hi,
Can you post a sample of your scanned report and the output you
want? It will be easier to write a VBA macro for a specific report
than a more generalised one.

Thanks,
PVA
Subject: Re: Text reports converter (techtor, feilong, cipher17, lot)
From: octopia-ga on 12 Jan 2003 11:09 PST
 
Thanks for your suggestion, miacid. I am now trying TextPipePro,
and will post results very soon.
Subject: Magic Answer for you
From: talhacelik-ga on 13 Jan 2003 16:41 PST
 
Here is some software for you, with 156 languages supported: Abbyy
Finereader 6.0, the newest version. You should try it first.
You can download the program from www.abbyy.com. If you like it,
please let me know. If it answers your need, I will be very happy. Bye.
Subject: Re: Magic Answer for you
From: octopia-ga on 13 Jan 2003 20:49 PST
 
Hi talhacelik

Thanks a lot for your suggestion. Abbyy is my favorite. It is the
best OCR software I have used so far.

The solution that I am looking for, however, is for processing the
data after it is in editable text format. In other words, after
you have the data in electronic format, you need to process it to put
each value in the corresponding cell, in order to build up a workable
csv file. So, it is about an intelligent application/solution that
could take a text block, recognize which values are which, and deal
with them on that basis.
Subject: Re: Text reports converter (techtor, feilong, cipher17, lot)
From: studboy-ga on 14 Jan 2003 11:28 PST
 
OK, the first step is identifying the separators for your input and output docs.

In the text file you posted here (because it's posted and
Google Answers treats it as all text), the CS's
are lost--I need:

1) For your input file--I need some help with the company name part:
is it possible that you can precede each company name with something
like 

Company:

? 

Simply put, when you try to process the file, the program needs a way to
separate the records.  Having an identifier helps.

2) For your output files, I assume the separator is a TAB, right?
It cannot be spaces because some of the fields (like addresses) have spaces
in there.

As soon as you help me with these two questions I will help you move on.
