Hello Mathtalk -
Judging from your other answered questions, I figured I would direct
this question to you. Hope you don't mind. :)
I need a program written for XP. Don't know if it's doable?
I have several Microsoft Word documents: each is a song
containing lyrics with chords, all written in Times New Roman, Arial,
or some other non-fixed-width font. For example, here's what a few
lines would look like:
C D G
HERE IS A SONG WITH LYRICS
C D G
AND THIS IS WHAT IT LOOKS LIKE
I need the documents to be converted to a fixed width font such as
Courier, but the position of the chords remains the same relative to
the lyrics underneath. I can't just highlight the entire document and
switch the text to Courier because the chords end up moving out of
sync in relation to the lyrics.
So I need an executable program I can run on XP which will convert an
MS Word document to a fixed width font such as Courier, but retain the
relative position of the chords in relation to the words underneath.
It would be great if the program could run in batch mode, to convert
multiple files in one fell swoop. No fancy buttons or interface
needed. It could simply be a command-line program.
The reason I need this program written is so that I can simply cut and
paste these documents into another program that will allow me to
transpose them into another key easily, for a songbook.
If a researcher/programmer could write this and provide a link to the
program to download, that would be excellent.
Please let me know if you need any clarification.
Thanks!
Barryf
|
Clarification of Question by
barryf-ga
on
19 Sep 2004 10:34 PDT
Rewrite of one sentence:
I need the documents to be converted to a fixed width font such as
Courier, but the position of the chords *must remain* the same relative to
the lyrics underneath
|
Request for Question Clarification by
leapinglizard-ga
on
19 Sep 2004 11:28 PDT
If there's a systematic way of determining from the text itself how
the chords and words line up, then I could write such a program for
you. But if the only way to see the alignment is by looking at the way
your document is graphically rendered in Word, it becomes an exercise
in human perception that can't be solved without some pretty serious
artificial-intelligence work.
leapinglizard
|
Clarification of Question by
barryf-ga
on
19 Sep 2004 13:19 PDT
Hello leapinglizard -
Do you have any ideas? :) Are there any functions to determine where
a character is in relation to another? Like a grid?
b
|
Request for Question Clarification by
leapinglizard-ga
on
19 Sep 2004 13:36 PDT
Yes, but such grids can only measure the placement of the characters
themselves -- letters, numbers, blanks -- and not of the pixels in
Word's graphical rendering of the page, you see? It's theoretically
possible to consider the pixels, but a practical implementation would
take days and weeks on the part of a computer-graphics specialist and
a computer scientist.
leapinglizard
|
Clarification of Question by
barryf-ga
on
19 Sep 2004 13:51 PDT
I see what you're saying. Perhaps that work is done, however? Are
there public domain OCR libraries that can detect where characters
are? For example, if the program could detect a "chord line" and then
detect a "lyric line" below it, could it merge the two:
from:
C D C
THIS IS A SONG TO BE SUNG
to this:
thi<C>S is a <D>song to be s<C>ung
Then it would convert the merged lines into Courier, and then parse
out the chords marked with "<>" characters and insert them in the line above?
Perhaps it's just too complicated pragmatically.
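The merge-and-split step described above can be sketched in Python. This is only a rough illustration and it assumes the hard part is already solved: that each chord's column over the lyric line is known, as it would be in fixed-width text. The function names merge and split are made up for this sketch.

```python
import re
from typing import Tuple

def merge(chord_line: str, lyric_line: str) -> str:
    """Insert each chord, wrapped in <>, at its column in the lyric line."""
    # Each run of non-blank characters in the chord line is one chord.
    chords = [(m.start(), m.group()) for m in re.finditer(r"\S+", chord_line)]
    # Pad the lyric so that every chord column exists in it.
    width = max([len(lyric_line)] + [col for col, _ in chords])
    lyric = lyric_line.ljust(width)
    out, prev = [], 0
    for col, chord in chords:
        out.append(lyric[prev:col])
        out.append("<" + chord + ">")
        prev = col
    out.append(lyric[prev:])
    return "".join(out).rstrip()

def split(merged: str) -> Tuple[str, str]:
    """Pull the <chord> markers back out into a separate line above the lyric."""
    chord_line, lyric, pos = "", "", 0
    for m in re.finditer(r"<([^<>]+)>", merged):
        lyric += merged[pos:m.start()]
        pos = m.end()
        col = len(lyric)
        if chord_line and len(chord_line) >= col:
            chord_line += " "          # keep neighbouring chords apart
        else:
            chord_line = chord_line.ljust(col)
        chord_line += m.group(1)
    lyric += merged[pos:]
    return chord_line, lyric
```

Once the chords are embedded in the lyric text, changing the font can no longer move them relative to the words; splitting afterwards rebuilds the chord line by character column.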
|
Request for Question Clarification by
leapinglizard-ga
on
19 Sep 2004 14:57 PDT
You've got exactly the right idea about OCR. It would be,
unfortunately, a very difficult task without custom software. My
educated guess is that no such software exists, and to have it written
to order would be a very costly and time-consuming procedure.
I've thought of two other approaches which, while much less
complicated than OCR, would still require significant effort to design
and develop.
First, it may be possible to apply machine-learning techniques to your
problem. You would manually prepare a training set consisting of, say,
twenty to fifty documents translated from Times to Courier. On the
basis of these human judgments, the machine-learning algorithm would
construct a statistical model that lets it predict, for a heretofore
unseen document in Times format, what the Courier equivalent looks
like. You would apply this model to the remaining documents, thereby
getting the work done on a mostly automatic basis. There would
probably be errors in the resulting translations, but not necessarily
more than would result from the OCR approach.
The other notion I have is one that depends on a hypothetical premise.
Namely, if each character of text is rendered in the Times font as a
bitmap with a fixed width regardless of the context of that character,
and if that fixed width can be determined for every character
appearing in your documents, then the chord alignments can be exactly
calculated. However, I'm not at all sure that this is the case. Due to
kerning and other fancy typographical effects, it's possible that the
width of a rendered character varies with its context. Even if it's
fixed, I don't know how one would go about finding the graphical width
of each character. Well, I suppose one could always magnify a
screenshot and start counting pixels.
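If the fixed-width premise did hold, the calculation might look like the Python sketch below. The WIDTHS table holds made-up numbers purely for illustration; real values would have to be measured from the actual font, and kerning is ignored entirely. The names left_edges and realign are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

# HYPOTHETICAL advance widths in arbitrary units (not real font metrics).
WIDTHS = {" ": 250, "i": 250, "W": 1000}
DEFAULT_WIDTH = 500  # assumed width for any character not in the table

def left_edges(line):
    """x offset of each character's left edge; the last entry is the total width."""
    return [0] + list(accumulate(WIDTHS.get(c, DEFAULT_WIDTH) for c in line))

def realign(chord_line, lyric_line):
    """Rebuild the chord line so each chord sits over the same lyric character
    it sat over in the proportional font, now counted in character columns."""
    chord_x = left_edges(chord_line)
    lyric_x = left_edges(lyric_line)
    out = ""
    i, n = 0, len(chord_line)
    while i < n:
        if chord_line[i].isspace():
            i += 1
            continue
        j = i
        while j < n and not chord_line[j].isspace():
            j += 1
        x = chord_x[i]
        # Index of the lyric character whose horizontal span contains x.
        col = bisect_right(lyric_x, x) - 1
        if col >= len(lyric_line):
            # Chord starts past the end of the lyric: count in space widths.
            col = len(lyric_line) + (x - lyric_x[-1]) // WIDTHS[" "]
        # Keep at least one blank between neighbouring chords.
        start = max(col, len(out) + 1) if out else col
        out = out.ljust(start) + chord_line[i:j]
        i = j
    return out
```

With these invented widths, a chord preceded by eight narrow spaces lands over the third character of "WWii", because the two wide letters take up as much room as the eight spaces.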
leapinglizard
|
Clarification of Question by
barryf-ga
on
19 Sep 2004 15:51 PDT
Interesting...
What about what efn says below? There are functions that measure the
size of characters? I have Conversions Plus by Dataviz, and I could
batch convert these files (about 70 of them) to RTF, text, or
otherwise. I've also increased the price of the question.
barryf
|
Request for Question Clarification by
leapinglizard-ga
on
19 Sep 2004 16:09 PDT
Indeed, efn's thoughts and mine appear to have crossed in the mail. If
efn has the time and inclination to solve your problem, I encourage
him to claim it. I will only contemplate answering your question if no
one else expresses an interest and I have plenty of time on my hands,
since I would have to install Windows on my spare machine and launch
an inquiry into the whereabouts of this string-measuring API function.
I'm primarily a Unix programmer, you see.
leapinglizard
|
Request for Question Clarification by
mathtalk-ga
on
20 Sep 2004 04:13 PDT
Hi, barryf-ga:
Thanks for posting the Question to my attention. I saw
leapinglizard-ga's Requests for Clarification and was somewhat puzzled
by the equivocal level of interest expressed at the end.
I believe that the problem might be attacked from a Rich Text format
approach, since you say the documents are in Word. Without a sample
of the document, it would be hard to say if this is "doable".
regards, mathtalk-ga
|
Clarification of Question by
barryf-ga
on
20 Sep 2004 07:00 PDT
mathtalk -
Yes, I apologize for that. It was in the original drafting of the
question, and I forgot to remove it after directing it to you. Sorry
for the confusion. You do have precedence on the question.
What's the best way for me to send you a file?
barryf
|
Request for Question Clarification by
mathtalk-ga
on
20 Sep 2004 07:53 PDT
Hi, barryf-ga:
The only option to share a file is to post a link (URL) to a location
from which it can be downloaded. If you need
suggestions/recommendations on how to arrange this, I'd be happy to
outline some inexpensive (free) ways to do this.
regards, mathtalk-ga
|
Clarification of Question by
barryf-ga
on
20 Sep 2004 08:58 PDT
mathtalk -
It occurred to me, that since the files are all pretty much the same
concept of chords followed by lyrics underneath, etc. that if you had
a copy of Word and cut/pasted the following, would this do it?
D F C
This is a lyric line number one
em/C D G F
This is lyric line number two
E Bb G
This is lyric line number three
E G
Chorus: This is lyric line number four
F
This is lyric line number five
G em/G
This is lyric line number six
Every file will be different, in that each has different lyrics and
different chords; but the same concept applies, pretty much. They're
all in Arial or Times New Roman.
barryf
|
Clarification of Question by
barryf-ga
on
20 Sep 2004 08:59 PDT
Also, sometimes the text is bold, and sometimes it's underlined.
|
Request for Question Clarification by
mathtalk-ga
on
20 Sep 2004 09:47 PDT
Hi, barryf-ga:
The Question hinges on whether the MS Word documents that you have can
be processed programmatically to produce the chord annotation in a
fixed width font.
To that end a cut-and-paste job from the fixed width font here back
into Word would not help much (to settle the original Question).
Document markup seems inevitably to allow many ways to obtain the same
"appearance". The most decisive approach would be to look into the
files you actually need to convert.
If you prefer, you can look into the Rich Text Format (RTF) content
yourself. Save As... one of your files in a .rtf format, and use a
plain text editor to view the result. There's bound to be a bunch of
"stuff" at the top which defines the fonts/alphabets used by the file,
and you can skip over that until you come to the lyrics, which should be
recognizable despite occasional intrusions of weird tags/curly
brackets of various kinds.
regards, mathtalk-ga
|
Clarification of Question by
barryf-ga
on
20 Sep 2004 10:18 PDT
mathtalk -
If you could clarify, for my own understanding: All things being equal
(you running a copy of Word, and me running a copy of Word), does it
matter how the sample text looks in Word once you cut and paste? It's
just sample text, so it doesn't even matter where the sample chords
are over the sample lyric lines. So long as the converted file looks
the same as the original (only in fixed-width font). I cut and pasted
that sample text into my copy of Word, and it showed up fine in Times
New Roman. If I were to save that as RTF or DOC, that would be the
type of file we would be working with. Theoretically, shouldn't I be
able to create a new file on your copy of Word, save it as the
appropriate file type, and convert the file, if I wanted to create a
new song, for instance? I can certainly find a way to get you the
file, but I want to know if that would work.
Thanks,
barryf
|
Clarification of Question by
barryf-ga
on
20 Sep 2004 10:42 PDT
(It would be nice to know this could work, in the event I have to
create and/or convert a file on another computer)
|
Clarification of Question by
barryf-ga
on
20 Sep 2004 12:49 PDT
Here's one of the files: ftp://69.200.128.2/
|
Request for Question Clarification by
mathtalk-ga
on
20 Sep 2004 13:20 PDT
Hi, barryf-ga:
If several Word documents have a similar internal structure, then it
makes sense to try and write a program that can reformat files "like"
that. If I write a program that converts a Word document that I made
up, I would not be surprised if it failed to make a satisfactory
conversion of a Word document you made up, even if they both have
similar appearances to what you posted here.
As I mentioned, Word and markup languages in general have many ways to
do the same thing as far as formatting goes, even with regard to what
might seem the simple element of horizontal spacing of words on a
line. We can discuss that in more detail at some point. In the
meantime I'll have a look at the file you posted to see how easy it
will be to convert by a custom program.
regards, mathtalk-ga
|
Request for Question Clarification by
mathtalk-ga
on
20 Sep 2004 19:54 PDT
Hi, barryf-ga:
After playing around a bit with the sample file, I think it is
possible to write a Word VB "macro" that will read through this kind
of document and output a plain text file that serves your purpose.
Let me think a bit more about the problem.
regards, mathtalk-ga
|
Clarification of Question by
barryf-ga
on
21 Sep 2004 16:37 PDT
Hey mathtalk -
Sorry for the delay in responding. Couldn't access my account.
If it seems like something overwhelming, let me know, because I may
have another idea.
barryf
|
Request for Question Clarification by
mathtalk-ga
on
21 Sep 2004 20:15 PDT
Hi, barryf-ga:
I think a Word "macro" (actually a VBA program) has a lot of
advantages. It would be somewhat portable across versions of
Word/platforms, so you could use this with essentially any computer
that had Word on it. VBA is about as entry level a language as any
these days, so finding other programmers to help maintain and extend
the program would be relatively easy. It would be pretty
straightforward to incorporate the "key transposition" feature.
The limitations are pretty much those that will apply to any program
that you could apply to the task at hand, namely working within some
defined framework of what sort of input and what sort of output are
acceptable.
In the sample file you posted we can identify several elements of the
input document. Besides the blank lines, the lines with chords, and
the lines with lyrics, there is a title, a left hand
margin/indentation, and various kinds of annotations that "fall into"
that margin (verse X, chorus, copyright notice).
So what assumptions can be made about these input elements? Will
there always be a title? Will it always fit on one line? How will we
identify the first line of the lyrics? the extent of the left hand
margin? How will we know we've reached the end of the song? What
besides major/minor chords (and slashes separating them) will be
present above (or below??) the lyrics?
Then on output, what assumptions can be made? Do you need to conserve
the title and the copyright notice, or (as suggested by the original
wording of your Question) is it desired to show only the chords and
lyrics in a fixed-width font? How wide should the left margin be (if
any) and should the verse/chorus annotations be preserved there?
Clarifying these requirements is what any professional
programmer/analyst would do with a client before attempting to code a
solution.
I'm not the world's fastest typist by any means, but it appears to me
that it would take me about a minute to cut-and-paste the text of one
song into a programmer's editor (like TextPad or DevStudio, the Visual
Studio editor) and clean up the plain text version. $35 buys a
relatively large amount of secretarial support in this regard, and
comparatively little programming support.
If you are a programmer who wishes to do this as an exercise, I'd be
happy to be your mentor and work through it with you, providing a
prototype program to get you started. I think you'd get your $35
worth in programming knowledge, though I think you'd need to convert
hundreds of songs before you'd break even on the combined cost of the
programming effort and the Question.
regards, mathtalk-ga
|
Clarification of Question by
barryf-ga
on
21 Sep 2004 21:37 PDT
mathtalk -
I appreciate you looking into this. You bring up a lot of good
points. I'm actually not a programmer (technically speaking),
although I understand some of the rudiments. Given the number of
variables and issues at stake, I think a different approach might be
in order. Here's a new proposition:
Perhaps you could write a macro that prompts the user for two inputs:
1) the existing key and 2) the desired new key to the song. It then
transposes the song right within Word. It occurred to me that this
would actually make more sense, since my ultimate goal is to be able
to easily re-key these songs. This would not require any
consideration of spacing, fonts, or the like. I've found a chart below
that shows what the chords are for each key:
http://www.guitarforbeginners.com/capo.html
Some of these songs have more complicated chords, with adjacent
superscripts such as 9th's, 11th's, etc. The program could simply
identify the fundamental chord and leave everything to the immediate
right of it the same (with the exception of encountering a "/"
character). For example:
Cm(Add9b5)/Bb
In this case, the program would identify the first character (C) and
change it to the appropriate new root chord letter; it would then scan
for a "/" character; if found, it would advance to the next, immediate
alpha character (B), changing that to the appropriate new root chord
letter. The "(Add9b5)" would remain intact, as would any accidentals
or additional characters. It would then advance to the next chord (or
chord combo). The end of the line I believe could be a hard return
character. I don't know the optimum way for the program to determine
a "chord line" vs. a "lyric line." There's no consistency in regards
to the use of tab characters and space characters. However, as the
first criterion, any line containing am, bm, cm, dm, em, fm, or gm
(case insensitive) should be considered a "chord line" immediately.
Perhaps the next criterion could be to see if there are any uppercase
characters, A through G, either preceded or followed by a tab, a
numerical character, or more than 2 spaces? There may be a better
way.
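The root-and-slash rule just described can be sketched in Python as follows. One deliberate difference from the description: the sketch treats a "#" or "b" directly after the root letter as part of the root, since leaving it in place would turn Bb raised by two semitones into "Cb" rather than C. The names transpose_chord and interval, and the use_flats switch, are inventions of this sketch.

```python
import re

SHARPS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
FLATS = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]
NATURAL = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTAL = {"": 0, "#": 1, "b": -1}

# A root is a letter A-G (either case) plus an optional accidental, found
# only at the start of the chord or directly after a "/".
ROOT = re.compile(r"(^|/)([A-Ga-g])([#b]?)")

def pitch(letter, accidental=""):
    """Pitch class 0-11 of a note name."""
    return (NATURAL[letter.upper()] + ACCIDENTAL[accidental]) % 12

def interval(old_key, new_key):
    """Semitones to move up, for key names such as 'C', 'Bb' or 'F#'."""
    old = ROOT.match(old_key)
    new = ROOT.match(new_key)
    return (pitch(new.group(2), new.group(3)) - pitch(old.group(2), old.group(3))) % 12

def transpose_chord(chord, semitones, use_flats=False):
    """Shift the root (and slash bass, if any); leave everything else intact."""
    names = FLATS if use_flats else SHARPS

    def shift(m):
        name = names[(pitch(m.group(2), m.group(3)) + semitones) % 12]
        if m.group(2).islower():          # keep "em"-style lowercase roots
            name = name[0].lower() + name[1:]
        return m.group(1) + name

    return ROOT.sub(shift, chord)
```

For example, transpose_chord("Cm(Add9b5)/Bb", interval("C", "D"), use_flats=True) gives "Dm(Add9b5)/C": the "(Add9b5)" is untouched because its letters come neither at the start nor right after a slash.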
The program would make its way through to the end of the document,
processing all chord lines and be finished when there are no more
chord lines to process.
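The pass through the document could be sketched like this in Python. It swaps in a different chord-line test from the one proposed above: rather than looking for "am", "em" and so on anywhere in the line (which would also match lyric words such as "them"), it asks whether every blank-separated item on the line looks like a chord. The pattern is an assumption and is certainly incomplete; the names CHORD, is_chord_line and process are made up here.

```python
import re

# Assumed shape of a chord: root, optional quality/extension pieces,
# optional slash bass. A one-word lyric line such as "A" would be
# misclassified, so this is a heuristic rather than a guarantee.
CHORD = re.compile(
    r"^[A-Ga-g][#b]?"
    r"(?:maj|min|dim|aug|sus|add|m|M|\d|[#b+\-]|\([^()\s]*\))*"
    r"(?:/[A-Ga-g][#b]?)?$"
)

def is_chord_line(line):
    """True if the line is non-blank and every item on it looks like a chord."""
    tokens = line.split()
    return bool(tokens) and all(CHORD.match(t) for t in tokens)

def process(text, transform):
    """Apply transform to each chord on each chord line; copy other lines as-is.
    Tabs and spaces between chords are left exactly as they were."""
    out = []
    for line in text.splitlines():
        if is_chord_line(line):
            line = re.sub(r"\S+", lambda m: transform(m.group()), line)
        out.append(line)
    return "\n".join(out)
```

Here transform stands for whatever is to be done to each chord, for instance a transposing function; the loop simply runs until the lines are exhausted, so no special end-of-song marker is needed.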
Does this sound viable?
barryf
|
Clarification of Question by
barryf-ga
on
21 Sep 2004 21:39 PDT
Oops, I overlooked your line:
"It would be pretty straightforward to incorporate the "key
transposition" feature."
That's pretty much what I'm looking for. Sorry for spelling that all
out... you obviously have musical knowledge.
|
Clarification of Question by
barryf-ga
on
22 Sep 2004 14:45 PDT
Hey mathtalk -
Are you still there? Would you be able to write this?
barryf
|
Request for Question Clarification by
mathtalk-ga
on
23 Sep 2004 05:38 PDT
Hi, barryf-ga:
How quickly do you need it? Even with the new, simplified approach,
I'd expect a good effort on this would take several hours of coding +
testing, and while I might find time to work on it over the next 2-3
weeks, the list price offered is (for me) not at the "drop everything
else and work on this now" level.
I've not locked the Question and would not do so until I do have time
to concentrate on it. In the meantime I'd be happy for you if another
Researcher decided to jump in and take on the responsibility. Of
course he or she should carefully review your Clarifications to
understand the modified requirements.
regards, mathtalk-ga
|
Clarification of Question by
barryf-ga
on
23 Sep 2004 08:57 PDT
mathtalk -
I understand. I would like to have it done within the next couple
weeks. Although I'd love to have you work on it, is it something I'm
better off reposting as a new question "for all?"
barryf
|
Clarification of Question by
barryf-ga
on
23 Sep 2004 10:35 PDT
mathtalk -
Actually upon rereading your clarification, that would seem like the
thing to do. I'll go ahead and post it as a new question for
clarity's sake. Thanks for your help.
barryf
|