Q: XML encoded format ( Answered 5 out of 5 stars,   2 Comments )
Question  
Subject: XML encoded format
Category: Computers > Software
Asked by: nrunner-ga
List Price: $50.00
Posted: 09 Sep 2005 10:48 PDT
Expires: 09 Oct 2005 10:48 PDT
Question ID: 566105
I am bidding on a Government contract to provide Transcription
services to a VA hospital. Our transcriptionists listen to dictation
called directly into our server and stored as .WAV files. The
transcriptionist then transcribes the dictation to text using
Microsoft WORD format. Documents are then transmitted back to our
server for QA and subsequent upload to the VA hospital. Transmission
of transcribed documents will be via the Web in Microsoft WORD format
utilizing an FTP protocol. One of the items in the Performance work
statement says:
"The Contractor will insure that all transcriptions are stored and
made available in an XML-Enabled Format."

I do not understand this requirement. Do I need special software to
conform to this requirement?

Clarification of Question by nrunner-ga on 12 Sep 2005 20:15 PDT
Pianoboy-77,
Thank you for your comments. In response I would like to reiterate
that I am on shaky ground here. I am not very familiar with the XML
format that I am inquiring about. That being the case let me try to
answer your questions.

I don't know if there is a specific XML schema. To date, none has been
proposed by the VA hospital, but that may change. I assume that you
mean specific forms. As I understand it, the requirement to parse to
other applications may just be why they want XML in the first place.
We will be sending documents back to the VA in Word format,
electronically, and some of the information would be used in different
parts of the Electronic Medical Record, such as demographic
information for Admissions, Discharges and Transfers, in addition to
Radiology reports, Pharmacology, Pathology, Surgery reports and such.
So yes I would say that the information provided in dictation and then
transcribed would be parsed for posting to the Electronic Medical
Record.

The only information that they have provided is in the sentence I
quoted "The Contractor will insure that all transcriptions are stored
and made available in an XML-Enabled Format". I am not sure they know
what XML is either. They just think they know that they need it.

Does it need to be human readable? I suppose it does when it is
printed or called up on a computer screen. Otherwise it's all 0's and
1's to me.

I hope I have given you some clarification and answers to your
questions. As I said, my knowledge is limited in this area.

Thanks for your time.
Answer  
Subject: Re: XML encoded format
Answered By: leapinglizard-ga on 27 Sep 2005 07:47 PDT
Rated: 5 out of 5 stars
 
Dear nrunner,

I agree with you that there's a good chance that the people who wrote
the bit about "XML-Enabled Format" don't know what they're talking
about. Nonetheless, you can make a favorable impression in your bid by
demonstrating that XML is not an alien notion to you.

XML is a method of organizing information in a form that does not
require any particular software package, such as Microsoft Word, to
process. An XML document is easy to parse by programmatic means because
it consists of text nested inside markup tags that themselves consist
of text. It should also be easily readable by humans by virtue of being
content-transparent. In other words, the structure of the document should
act as a guide to its content.

To make these principles more concrete, let me show you an XML document
that contains a newspaper article.

<article>
    <title> Mice Prefer Cheddar Cheese </title> 
    <date> 2005.09.27 </date>
    <author> leapinglizard </author>
    <body>  
        <paragraph>
            Neuroscientists at MIT report that when faced
            with a choice between runny French cheeses and
            hard Canadian cheddar, 63% of laboratory mice
            prefer the cheddar cheese.
        </paragraph>
        <paragraph>
            Professor Egbert H. Bottomley cautions that the
            experimental results may not be applicable to wild
            mice. "Our mice were raised in a safe, sterile
            environment," said Prof. Bottomley. "Intrepid
            outdoor mice may well prefer the fragrant
            French stuff."
        </paragraph>
        <paragraph>
            The Department of Neuroscience is now seeking a
            contract with a major American food manufacturer
            to explore ways of commercializing these findings.
        </paragraph>
    </body>
</article>

Notice that this XML document contains no information about the typeface
size and style, text justification and spacing, or other presentation
concerns. Only the raw content is presented, so that software further
down the pipeline can easily read it and take care of the details of
page layout and output rendering.
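
To give a rough idea of what "software further down the pipeline" might
do with such a document, here is a minimal sketch in Python using the
standard library's xml.etree.ElementTree module. The article is a
shortened copy of the one above, and the helper names clean and
read_article are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the article above, held in a string for the demonstration.
ARTICLE_XML = """
<article>
    <title> Mice Prefer Cheddar Cheese </title>
    <date> 2005.09.27 </date>
    <author> leapinglizard </author>
    <body>
        <paragraph>
            Neuroscientists at MIT report that 63% of laboratory
            mice prefer the cheddar cheese.
        </paragraph>
        <paragraph>
            Professor Egbert H. Bottomley cautions that the results
            may not be applicable to wild mice.
        </paragraph>
    </body>
</article>
"""


def clean(text):
    """Collapse the indentation and line breaks inside an element's text."""
    return " ".join((text or "").split())


def read_article(xml_text):
    """Return the article's fields as a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {
        "title": clean(root.findtext("title")),
        "date": clean(root.findtext("date")),
        "author": clean(root.findtext("author")),
        "paragraphs": [clean(p.text) for p in root.iter("paragraph")],
    }


article = read_article(ARTICLE_XML)
print(article["title"])
print(len(article["paragraphs"]), "paragraphs")
```

Because the tags name the content rather than its appearance, the
program can ask for the title or the paragraphs directly and leave the
question of fonts and layout to whatever displays them.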

By contrast, a Word document is fully styled, with fonts and italics
and indentation. All of this formatting information is saved either in
a binary file as unreadable cruft, or converted into a Word-specific XML
format in which the text is buried under markup that describes its
appearance rather than its meaning, and which is therefore awkward for
others to consume.

So the bad news is that a Word document is not, as it stands, an
XML-enabled file format. The good news is that it's not hard to generate
documents that are either ready-made XML or that are easily wrapped into
an XML schema. The key is to stick with text in creating your files, and
I do mean text only. You can't even rely on Word's text output facility
to give you a file that's free of formatting commands.

To make a file that is free of any extraneous information, you will want
your transcriptionists to use a very simple editor that doesn't give the
user any formatting options beyond the carriage return. One such editor
is Notepad, which is built into every Windows installation. Saving a
document in Notepad will result in a pure text document that is readily
converted, whether by a little custom text-processing script or by some
easy manual labor, into an XML document. A plain text file is indeed,
in that sense, XML ready: just add markup!
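
As an illustration of the "little custom text-processing script"
mentioned above, here is a minimal sketch in Python. It assumes that a
blank line separates paragraphs in the text file. The element names
(transcription, patient_id, dictated_on, body, paragraph) and the sample
values are invented for the example, since the VA has not specified a
schema.

```python
import re
import xml.etree.ElementTree as ET


def transcript_to_xml(plain_text, patient_id, dictated_on):
    """Wrap a plain-text transcript in a small, made-up XML layout."""
    root = ET.Element("transcription")
    ET.SubElement(root, "patient_id").text = patient_id
    ET.SubElement(root, "dictated_on").text = dictated_on
    body = ET.SubElement(root, "body")
    # Notepad saves Windows line endings; normalize them first.
    normalized = plain_text.replace("\r\n", "\n")
    # A blank line in the text file marks the end of a paragraph.
    for block in re.split(r"\n\s*\n", normalized):
        words = block.split()
        if words:
            ET.SubElement(body, "paragraph").text = " ".join(words)
    # ElementTree escapes characters such as & and < while serializing.
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample text standing in for a file saved from Notepad.
sample = "Patient reports pain & swelling.\r\n\r\nBP < 120/80 at discharge.\r\n"
xml_text = transcript_to_xml(sample, "EXAMPLE-001", "2005-09-27")
print(xml_text)
```

Letting a library write the tags, rather than pasting them in by hand,
means that awkward characters in the dictation are escaped for you.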

To make an XML document itself, you first design the schema, which can be
something as simple as the one I used above, and then you instruct your
employees to apply it without fail in writing their transcripts. The
fundamental rules of XML markup are that the whole document must sit
inside a single outermost pair of tags, that every opening tag must be
paired with a closing tag, and that pairs of tags must be nested without
intersection. In addition, the characters & and < must be written as
&amp; and &lt; wherever they occur in the text itself.
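
If transcripts are marked up by hand, it is worth checking each one
mechanically before it is uploaded. The following is a minimal sketch,
again in Python with the standard xml.etree.ElementTree module. The
function name well_formedness_error and the three sample strings are
made up for the illustration.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if the text parses as XML, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


# Every tag is closed and properly nested.
good = "<report><title>Chest X-ray</title><body>Clear.</body></report>"
# The <title> tag is never closed.
unclosed = "<report><title>Chest X-ray<body>Clear.</body></report>"
# The <title> and <b> pairs intersect instead of nesting.
crossed = "<report><title>Chest <b>X-ray</title></b></report>"

for name, text in [("good", good), ("unclosed", unclosed), ("crossed", crossed)]:
    problem = well_formedness_error(text)
    print(name, "->", "well-formed" if problem is None else problem)
```

This only confirms that the tags are paired and nested correctly. It
does not check that a transcript follows whatever layout you have
designed, such as whether a required element is missing.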

I hope this primer on the spirit and practice of XML gives you renewed
confidence in your bidding effort.

Regards,

leapinglizard
nrunner-ga rated this answer: 5 out of 5 stars
Very clear and concise answer. Thank you for your time and patience.

Comments  
Subject: Re: XML encoded format
From: yiferic-ga on 09 Sep 2005 11:51 PDT
 
Probably not. It simply means the information has to be portable and
easily accessible.

http://www.xml.com/pub/a/98/10/guide0.html?page=2#AEN58
Subject: Re: XML encoded format
From: pianoboy77-ga on 12 Sep 2005 15:46 PDT
 
My answer would be: It depends.

- Can it be any XML, or do they need it in their own customized format
(i.e. they have a certain XML schema they want you to conform to or
can you use any schema)?

- Why do they need it in XML? Is it just so they're not
using/transmitting binary files? Do they need to be able to parse the
XML easily from another application? Does it need to be human
readable?

The answers to these questions affect the answer to your question.

Also, I should mention that the term "XML-Enabled format" doesn't make
sense to me. A document is either represented as XML or it isn't.
Maybe by "enabled" they mean that it must be easy to convert the
format of your files to XML (e.g. in case they're thinking they may
want to use XML in the future).

The good news:
Microsoft Word 2003 has a "Save as XML" feature (select File --> Save
as..., then change the file type from Word Document to XML Document),
which saves your Word documents as XML documents, without loss of any
formatting or any other information. So technically, you can easily
meet the requirement as stated. You can just get your
transcriptionists to use this feature in Word, so all your files
created are automatically in XML format.

The bad news:
Unfortunately, the XML generated by Word is extremely complex, and not
easily human readable or easy to parse. So if the company wants to be
able to parse the XML from another application, or view the XML and
read the transcription easily using a plain text editor, Word's "Save
as XML" feature won't cut it for you. Fortunately, there are some
tools out there that can convert your Word documents to simpler XML
formats (See the links below). If your Word documents are fairly
simple (i.e. no fancy tables, formatting, graphics, etc.), you can
also create your own application fairly easily that will convert Word
documents to XML customized to your liking (see the links below).
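
To sketch what such a home-made conversion might look like (in Python
here, rather than the VB.NET used in the article linked below), the
following pulls the text out of each paragraph of a Word 2003 XML file
and writes it into a much simpler layout. The sample input is
hand-written and heavily simplified, real files saved by Word contain
far more markup, and the output element names are invented for the
example.

```python
import xml.etree.ElementTree as ET

# Namespace used by Word 2003's XML format (WordprocessingML).
W = "http://schemas.microsoft.com/office/word/2003/wordml"

# Hand-written, heavily simplified stand-in for a file saved by Word.
SAMPLE = f"""<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:t>Chest X-ray: </w:t></w:r>
      <w:r><w:rPr><w:b/></w:rPr><w:t>clear.</w:t></w:r>
    </w:p>
    <w:p><w:r><w:t>Follow up in six weeks.</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def word_xml_to_simple_xml(word_xml):
    """Keep only the paragraph text, dropping all formatting markup."""
    source = ET.fromstring(word_xml)
    target = ET.Element("transcription")
    for para in source.iter(f"{{{W}}}p"):
        # A paragraph's text is split across one or more w:t elements.
        text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
        if text.strip():
            ET.SubElement(target, "paragraph").text = text.strip()
    return ET.tostring(target, encoding="unicode")


simple_xml = word_xml_to_simple_xml(SAMPLE)
print(simple_xml)
```

An approach like this is only reasonable for simple documents. Tables,
graphics and similar features would need handling of their own.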

Here are some links for you:

General introduction to XML
-----------------------------
http://en.wikipedia.org/wiki/Xml
http://www.w3schools.com/xml/default.asp

Info on tools that convert Word docs to simple XML docs:
----------------------------------------------------------
http://www.xml.com/pub/a/2003/12/31/qa.html

How to create your own customized Word to XML conversion app (using VB.NET):
----------------------------------------------------------------
http://www.devx.com/dotnet/Article/17358/0/page/1

Hope this helps!
