I need to design an algorithm to determine if a given (directed) graph
is "semi-connected" in linear time? So far 'I believe' that I need to
have a directed graph implemented in an adjacency list, then run DFS,
then transpose, then DFS again, and then run topological sort. And it
should say yes/no if semiconnected or not, and take O(V+E) time. Any
ideas on the steps needed, and a working version in C/C++? I found
some stuff at http://www.boost.org/ but I don't understand how to put
it all together. I am open to any different ways to conquer this
problem also. |
Request for Question Clarification by
mathtalk-ga
on
23 Nov 2002 23:26 PST
Hi, mike50-ga:
Thanks for posting this very interesting question. I believe that
your question has two distinct parts:
(a) Describe a good linear time algorithm to check if a directed graph
G is semi-connected, i.e. between any two vertices u,v of G, there
exists either a path from u to v or one from v to u (or both).
(b) Implement a working version in C/C++.
I am interested in answering your question, but I believe that to
answer it well, your question will require more time and effort than
the average amount of time and effort associated with the listed
price. Here is a link to guidelines about pricing your question:
https://answers.google.com/answers/pricing.html
You have some ideas in mind about part (a), but I'm not clear about
what exactly you propose. The approach I have in mind involves one
DFS and one topological sort, and not much else (some "bookkeeping"
computations).
For the list price offered I would gladly provide a clear solution to
(a), perhaps a little simpler than what you were thinking, but in any
case having time complexity O(|V| + |E|). The C++ library stuff you
found at boost.org seems a great start on implementation, but a full
response to part (b) would ideally involve more than just outlining
the steps to be taken.
Here's what I'd like you to clarify:
1. Your thoughts on the algorithm to use, and value if any to my
giving one that is simpler (easier to implement) if possible within
the linear time constraint.
2. Regarding implementation, the preferred platform/compiler, size of
inputs to be handled (you specify a simply yes/no in output), and
desired time frame.
If you review your list price and respond to my clarification request
(essentially letting me know if I'm reading too much or too little
into the tasks), I'll be notified and will quickly take another look
at your problem.
regards, mathtalk-ga
|
Clarification of Question by
mike50-ga
on
24 Nov 2002 08:03 PST
Thanks for the quick response. I am in need of simply a "test
program" to test a given input quickly to determine it it's
semi-connected. So I am just looking for the answer to part <b> if
that will help. Don't worry about the analysis part at all, and the
efficency is not a big deal either. I would just like to play around
with a working version in C/C++ where you just plug in a set of
edges/vertices or something, then it gives you a simple T/F answer to
the question. Please don't spend anymore time analyizing the problem,
I believe that your idea will work perfectly, I've spent tons of time
thinking about this and just need a basic program to test out. "The
approach I have in mind involves one DFS and one topological sort, and
not much else(some "bookkeeping" computations)." I also believe that
it can be implemented with "strongly connected components algorithm"
then some modification also. But still, I am just looking for a quick
test program that will compile and test out a few "input files of
graphs."
|
Clarification of Question by
mike50-ga
on
24 Nov 2002 08:33 PST
As far as input/output goes, I really don't care how it is displayed.
For example I figured that the program would start, look for an
input.txt file:
5
0 1 4 -1
1 4 4 0 2 -1
2 1 3 -1
3 2 4 3 -1
4 1 1 3 0 -1
that specifies the graph (just an example, feel free to do whatever's
easiest for you).
Then it will run the algorithm and output "anything" that will let you
know if its semi-connected. Please feel free to do whatever will
work!
|
Hi, mike50-ga:
I did an implementation which is not linear-time but is something
between quadratic and cubic in the number of vertices. I think this
will be fast enough for your purpose, and it makes for a very
self-contained program.
Here's the code I wrote for a VS.Net C++ "console app"; let me know if
you have questions about making the changes to compile it for another
platform.
The MS style is to use an application header file where the "standard
includes" go, in this case stdafx.h includes stdio.h for us:
[begin stdafx.h]
// stdafx.h : include file for standard system include files,
// or project specific include files that are used frequently, but
// are changed infrequently
//
#pragma once
#define WIN32_LEAN_AND_MEAN // Exclude rarely-used stuff
#include <stdio.h>
#include <tchar.h>
// TODO: reference additional headers your program requires here
[end stdafx.h]
Here is the main program:
[begin semiconn.cpp]
// semiconn.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <vector>
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
printf("Enter # of vertices:");
int v;
if (scanf("%d", &v) < 1)
{
printf("Invalid # of vertices entered.\n");
return 0;
}
if (v < 2)
{
printf("# of vertices must be at least 2.\n");
return 0;
}
// create a square matrix of dimension v by v
vector < vector<int> > m1;
m1.resize(v);
for (int i=0; i<v; i++)
{
m1[i].resize(v);
}
// initialize m1 as the identity matrix
for (int i=0; i<v; i++)
for (int j=0; j<v; j++)
{
m1[i][j] = (i==j);
}
// read the list of 'out' edges for each vertex
printf("For each vertex, list destinations (end with -1):\n");
for (int i=0; i<v; i++)
{
printf("vertex %d : ", i);
int j;
while(scanf("%d", &j))
{
if (j<0) break;
m1[i][j] = 1;
}
}
// m1[i][j] tells if there exists path i to j of no more than 1
edge
// a minimal path could be is v-1 edges
// this raises matrix m1 to this power (or more) by repeated squaring
int n = 1;
while (n < (v-1))
{
for (int i=0; i<v; i++)
for (int j=0; j<v; j++)
{
if (m1[i][j]) continue;
for (int k=0; k<v; k++)
{
if (m1[i][k] && m1[k][j])
{
m1[i][j] = 1;
break;
}
}
}
n *= 2;
}
// now we have all the path connections we ever will have
// test for semi-connectedness
int flag = 1;
for (int i=0; i<v; i++)
{
for (int j=i+1; j<v; j++)
{
if (!(m1[i][j] || m1[j][i]))
{
flag = 0; // no path from i to j or from j to i
break;
}
}
if (!flag) break;
}
// print the test result in flag
if (flag)
printf("\nThe digraph is semi-connected.\n");
else
printf("\nThe digraph is not semi-connected.\n");
return 0;
}
[end semiconn.cpp]
I tested the program with your sample data, which can be entered
interactively or by redirection from a test file. It is not necessary
to include the "source" vertex on the edge entry lines, as you appear
to do, but as this is interpreted as a self-edge, it does no harm.
The program says your example is semi-connected.
I'll do a bit more testing, and while I would like to believe my code
and comments are clear and understandable, I will not be shocked to
learn that some clarification is desirable.
regards, mathtalk-ga |
Request for Answer Clarification by
mike50-ga
on
25 Nov 2002 07:56 PST
I got it working after a few bugs, it runs nice. The only thing is,
the way I have the rest of my application set up, I wish to use only
an adjacency list (link-list) representation. I was hoping to stay
away from using the matrix implementation. I there any way to convert
to link list and still be able to test the graph? I realize its much
easier to see with the matrix though. I'll try and play around it
things for a little bit, then let you know if I have any questions.
Thanks again!
|
Clarification of Answer by
mathtalk-ga
on
25 Nov 2002 08:00 PST
-- COMMENT LINE WRAPPING --
I see one comment line in semiconn.cpp that "wraps" in the window,
putting the word "edge" by itself on the next line. Naturally this
will give a syntax error when you compile. I tried to keep the
comments brief to avoid this, but that one slipped through.
BTW, I tried (successfully) a simple example, 1 --> 2 <-- 3 , to test
the logic for a digraph which is not semi-connected.
If M is the "incidence" matrix for the directed graph (not necessarily
symmetric), then information about paths of varying lengths from one
vertex to another is encoded by:
I + M + M^2 + M^3 + . . .
where the entries of M^n count the number of distinct paths from a
given source to a given destination. What I did above is to start
with I + M and repeatedly square it (keeping positive entries simply
as "boolean" 1).
If space were an issue we could use the specialized class vector<bool>
rather than vector<int> for the underlying matrix rows. However the
time complexity would probably begin to cause problems before space
issues become critical, so I went with the integer representation.
There may be some Standard Template Library (STL) issues with another
compiler/platform, as all vendors seem to do things a little
differently. But the vector template is about as vanilla as it gets,
and we aren't using it for anything fancy here (just as a dynamic
multi-dimensional array). For what its worth, Visual Studio's
Intellisense seemed to already "know" about vectors but the compiler
complained unless I used the std namespace.
-- mathtalk-ga
|
Clarification of Answer by
mathtalk-ga
on
25 Nov 2002 08:06 PST
Talk about clarifications that cross in the mail!
I think the code would convert rather easily to an "adjacency" link
list; that's actually all the matrix is encoding. If you post your
class definition for the link list, I'll be happy to take a look.
Thanks again for posting such an interesting problem.
-- mathtalk-ga
|
Request for Answer Clarification by
mike50-ga
on
25 Nov 2002 17:20 PST
Ok, i'm working on implementing it into my program, but I am just
having a few minor problems, since I am using just ".c"
(1) Below is your code, that I just did a few things to, it compiles
fine in Visual C++. Could you simply fix the declarations so it works
in straight c. It gives an error like "eh.h is for c++, not c."
Sorry to bother you, but this will be the last.
#include <stdio.h>
#include <vector>
using namespace std;
int main() {
printf("Enter # of vertices:");
int v;
if (scanf("%d", &v) < 1)
{
printf("Invalid # of vertices entered.\n");
return 0;
}
if (v < 2)
{
printf("# of vertices must be at least 2.\n");
return 0;
}
// create a square matrix of dimension v by v
vector < vector<int> > m1;
m1.resize(v);
for (int i=0; i<v; i++)
{
m1[i].resize(v);
}
// initialize m1 as the identity matrix
for (i=0; i<v; i++)
for (int j=0; j<v; j++)
{
m1[i][j] = (i==j);
}
// read the list of 'out' edges for each vertex
printf("For each vertex, list destinations (end with -1):\n");
for (i=0; i<v; i++)
{
printf("vertex %d : ", i);
int j;
while(scanf("%d", &j))
{
if (j<0) break;
m1[i][j] = 1;
}
}
// m1[i][j] tells if there exists path i to j of no more than 1
edge
// a minimal path could be is v-1 edges
// this raises matrix m1 to this power (or more) by repeated squaring
int n = 1;
while (n < (v-1))
{
for (int i=0; i<v; i++)
for (int j=0; j<v; j++)
{
if (m1[i][j]) continue;
for (int k=0; k<v; k++)
{
if (m1[i][k] && m1[k][j])
{
m1[i][j] = 1;
break;
}
}
}
n *= 2;
}
// now we have all the path connections we ever will have
// test for semi-connectedness
int flag = 1;
for (i=0; i<v; i++)
{
for (int j=i+1; j<v; j++)
{
if (!(m1[i][j] || m1[j][i]))
{
flag = 0; // no path from i to j or from j to i
break;
}
}
if (!flag) break;
}
// print the test result in flag
if (flag)
printf("\nThe digraph is semi-connected.\n");
else
printf("\nThe digraph is not semi-connected.\n");
return 0;
}
|
Clarification of Answer by
mathtalk-ga
on
25 Nov 2002 21:08 PST
Hi, mike50-ga:
Unfortunately this is not a matter of changing a declaration or two.
The C++ program I wrote at your request depends specifically on the
vector< > template from STL. For example, the dynamically allocated
multidimensional "array" m1.
To duplicate this functionality in C is possible, but it might not be
the best approach for your purposes. Please allow me to suggest two
alternatives:
1) Recall the "adjacency" link-list which you said you are using
elsewhere in your program. If you want the algorithm to be framed in
terms of that "C" data structure, it would perhaps be ultimately more
satisfactory to you to do it that way (instead of doing a more or less
literal "write-down" of my C++ code into C)?
2) If you are using C code for most of your program, you can still
compile a C++ file containing a subroutine with "C linkage" and call
that from the rest of your program modules written in C. Little of my
C++ code would need to be rewritten in that scenario; one can simply
rename the "main" routine to something like "semiconn" and declare it
thusly:
extern "C" int semiconn(int v, char *fname)
which will cause the Visual C++ compiler to create an "unmangled"
internal name for the function, suitable for letting a C compiled
routine call it.
In this way one can (carefully) mix C and C++ code in a program.
regards, mathtalk-ga
|
Request for Answer Clarification by
mike50-ga
on
25 Nov 2002 21:34 PST
If it will help, I can email you the project code that I am working
on, so you can see what my class definitions and structures are. It
is pretty large, and I would rather not post it here. If you choose,
you can email me at miket513@hotmail.com and I will forward it to
you. Also, since this requires extra work, I can either create a new
question, or add to the tip (probably easier). Once again, thanks for
your help.
|
Clarification of Answer by
mathtalk-ga
on
25 Nov 2002 22:17 PST
Hi, mike:
Thanks for the kind comments, it means a lot. Note that the GA Terms
of Service are pretty strict about direct contacts between researchers
and customers. If it becomes necessary to exchange files, the usual
way this is handled is by putting up a link to a Web site, rather than
email.
Here are some quick notes before I have to sign-off for tonight:
There are two common ways to "simulate" a dynamic multidimensional
array in C.
1) One can typedef an array of "pointers to int", then use malloc to
allocate space for such an array of pointers, as well as space for
each individual row (whose address is assigned to the corresponding
entry in the array of pointers). This allows one to use the m[i][j]
syntax.
2) One can simulate the multidimensional array with a single
dimensional array; this is much the simplest approach but forces us to
abandon the syntax m[i][j] for the more awkward and explicit m[i*v +
j].
Neither approach does much violence to the "structure" of the program
I wrote, but both require some rework. Let me sleep on this, and
return to the fray in the morning.
regards, mathtalk-ga
|
Clarification of Answer by
mathtalk-ga
on
26 Nov 2002 19:18 PST
Hi, mike50-ga:
Here is the vanilla "C" version of the program. Thanks for taking
time to rate my answer!
[begin semiconn.c]
// semiconn.c : redefines the entry point for the console application.
// Nov. 26, 2002 changed from C++ to C source per mike50's request
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>
int main(int argc, char* argv[])
{
int i,j,k,n,v,**m1;
printf("Enter # of vertices:");
if (scanf("%d", &v) < 1)
{
printf("Invalid # of vertices entered.\n");
return 0;
}
if (v < 2)
{
printf("# of vertices must be at least 2.\n");
return 0;
}
// create a square matrix of dimension v by v
m1 = (int **)malloc(v*sizeof(int *));
for (i=0; i<v; i++)
{
m1[i] = (int *)malloc(v*sizeof(int));
}
// initialize m1 as the identity matrix
for (i=0; i<v; i++)
for (j=0; j<v; j++)
{
m1[i][j] = (i==j);
}
// read the list of 'out' edges for each vertex
printf("For each vertex, list destinations (end with -1):\n");
for (i=0; i<v; i++)
{
printf("vertex %d : ", i);
while(scanf("%d", &j))
{
if (j<0) break;
m1[i][j] = 1;
}
}
// m1[i][j] tells if path (i to j) of at most 1 edge exists
// a minimal path is at most v-1 edges
// raises matrix m1 to this power or more by repeated squaring
n = 1;
while (n < (v-1))
{
for (i=0; i<v; i++)
for (j=0; j<v; j++)
{
if (m1[i][j]) continue;
for (k=0; k<v; k++)
{
if (m1[i][k] && m1[k][j])
{
m1[i][j] = 1;
break;
}
}
}
n *= 2;
}
// now we have all possible path connections
// test for semi-connectedness, using flag n
n = 1;
for (i=0; i<v; i++)
{
for (j=i+1; j<v; j++)
{
if (!(m1[i][j] || m1[j][i]))
{
n = 0; // no path (i to j) or (j to i)
break;
}
}
if (!n) break; // one failure suffices
}
// print the test result of flag n
if (n)
printf("\nThe digraph is semi-connected.\n");
else
printf("\nThe digraph is not semi-connected.\n");
// free the memory allocated for m1
for (i=0; i<v; i++)
{
free(m1[i]);
}
free(m1);
return 0;
}
[end semiconn.c]
regards, mathtalk-ga
|