Q: Designing for Scalability: Java, MySQL, and Linux ( No Answer,   7 Comments )
Question  
Subject: Designing for Scalability: Java, MySQL, and Linux
Category: Computers > Programming
Asked by: memphisblues-ga
List Price: $50.00
Posted: 18 May 2004 08:37 PDT
Expires: 20 May 2004 04:41 PDT
Question ID: 348207
Google Researchers -

I'm building a web application that stores documents for a small local
business.  The initial storage space required will not exceed 80GB and
therefore, one server will be sufficient.  However, I know that the
company will eventually need to store more than this amount, and I'd
like to design the software to handle it initially.

The software stack will use Apache, Tomcat, and MySQL on Linux.  I
have software development experience, but not regarding clustering and
the resulting design considerations.  I would like some advice on how
to design my software.  I'm also not as familiar with Linux, so please
provide simple terminology.

Specifically, how can I store references to files (documents) so that
adding new servers (or drives) will be seamless?  If I were storing
all documents on one server, every time a user adds a file, I would
upload the file to the user's directory, renaming the file to
something unique: /home/john-doe/uniquename001.doc.  In the software,
I'm planning for every user to have their own directory, but if you
have a better idea, I am open to it.

My problem: if I add a new server, how do I store which server the
user's files are on?  Should I instead have one file server and add
drives (if so, how do I store which drive/partition a user is located
on - this may be a dumb Linux question, but again, I'm used to c:\ and
d:\ on Windows)?

In summary, I'm looking for a solid answer regarding a scalable 
architecture and software design that would accommodate a growing user
and document base.  Also, I would like the names and contact
information for people/groups that I may hire to help me implement
these suggestions (e.g. consulting groups specializing in building
Java-based scalable software).  Focus should be on true scalability
and cost effectiveness - starting out with one server, building into
multiple.  I don't want to spend thousands of dollars per month (I'm
hosting with Rackspace) starting out, but would be willing to if the
client base required it.

Thank you,
Josh
Answer  
There is no answer at this time.

Comments  
Subject: Re: Designing for Scalability: Java, MySQL, and Linux
From: tobascus-ga on 19 May 2004 04:43 PDT
 
I am not a Google Researcher, but I would like to comment on your questions.

Let me introduce myself first. I have worked in the software
development industry for the last 8 years and have been part of many
commercial projects on different platforms. I have led projects and
served as an architect on several of them. Hopefully you will find
my answer practical and interesting.
 
Question:
=========
The software stack will use Apache, Tomcat, and MySQL on Linux.  I
have software development experience, but not regarding clustering and
the resulting design considerations.  I would like some advice on how
to design my software.  I'm also not as familiar with Linux, so please
provide simple terminology.

Comment:
========
Architecture and design are the solution to the technical and
functional requirements of a project or problem.

For example, one of your client's requirements would be "Cost
Effectiveness of the Solution", and in answer to that you chose these
reliable, free software packages for development and deployment.

What I want to say is that I can only recommend a useful and
practical solution (in terms of architecture and design) if I have
more details of the project requirements. Discussing only the basic
idea is not enough, because the client always has important concerns
and constraints.

I could suggest many sites on the topic of architecture and design (I
am sure you have already searched for them), but in my opinion these
would not help you much. In my experience, each custom project
should be treated according to its specific client requirements.

Question:
========
Specifically, how can I store references to files (documents) so that
adding new servers (or drives) will be seamless?  If I were storing
all documents on one server, every time a user adds a file, I would
upload the file to the user's directory, renaming the file to
something unique: /home/john-doe/uniquename001.doc.  In the
software, I'm planning for every user to have their own directory,
but if you have a better idea, I am open to it.

My problem: if I add a new server, how do I store which server the
user's files are on?  Should I instead have one file server and add
drives (if so, how do I store which drive/partition a user is located
on - this may be a dumb Linux question, but again, I'm used to c:\ and
d:\ on Windows)?

Answer:
=======
Consider these scenarios:

1) One server machine with one drive
2) One server machine with multiple drives
3) Multiple server machines with one drive each
4) Multiple server machines with multiple drives

In my experience, the following should be done to address all four scenarios:

1) The software should be able to register a server machine
with the following parameters:

         (1) Server machine name/IP (so that it can be accessed)
         (2) Admin ID to access all the drives on the server machine
         (3) Admin password to access all the drives on the server machine
         (4) Drives with symbol (e.g. C: or D: on Windows, or a mount
             point on Linux) and name (e.g. my local drive)

2) The software should be able to register all the server machines
along with information about their available drives.

3) All of this information should go into a database for reference.

Now, when a user submits a file for storage, the software should
check the available space on each server, drive by drive, and store
the file on a server and drive that have enough free space.

If a file "FirstFile.Doc" is saved to server "Server4" on the "E:" drive
for application "DocumentsDataStore" and user "Josh", then the record
stored in the database could contain the following:

Server4 machine address
File path
File unique ID (generated by the system)
File saved by
File name
File description
Etc.

The path could be:

"\\Server4\E$\DocumentsDataStore\Josh\FirstFile.Doc"

Whenever the user needs to open a stored file, the software accesses
that file through the path above. Since a database table records each
user's file paths along with access rights information, users can
reach their files using the stored information.

In Java terms, servlets can be used to retrieve and store files on
the registered servers.
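
The registration-and-lookup scheme described above can be sketched
roughly as follows. This is only a minimal illustration under stated
assumptions: the class names (StorageVolume, FileLocation,
StorageRegistry) are hypothetical, an in-memory list and map stand in
for the database tables, and free space is a number tracked by the
application rather than measured on the servers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** One registered storage location: a server plus a drive or mount point on it. */
final class StorageVolume {
    final String server;      // machine name or IP
    final String mountPoint;  // drive or mount point on that server
    long freeBytes;           // free space as tracked by the application (assumption)

    StorageVolume(String server, String mountPoint, long freeBytes) {
        this.server = server;
        this.mountPoint = mountPoint;
        this.freeBytes = freeBytes;
    }
}

/** Where one stored file lives; in practice this would be a database row. */
final class FileLocation {
    final String fileId;
    final String owner;
    final String originalName;
    final String server;
    final String path;

    FileLocation(String fileId, String owner, String originalName,
                 String server, String path) {
        this.fileId = fileId;
        this.owner = owner;
        this.originalName = originalName;
        this.server = server;
        this.path = path;
    }
}

/** Hypothetical registry: picks the first volume with enough room and records the location. */
final class StorageRegistry {
    private final List<StorageVolume> volumes = new ArrayList<>();
    private final Map<String, FileLocation> locations = new LinkedHashMap<>();
    private int nextId = 1;

    void register(StorageVolume volume) {
        volumes.add(volume);
    }

    FileLocation store(String owner, String originalName, long sizeBytes) {
        for (StorageVolume v : volumes) {
            if (v.freeBytes >= sizeBytes) {
                v.freeBytes -= sizeBytes;
                String fileId = "f" + nextId++;  // system-generated unique id
                String path = v.mountPoint + "/" + owner + "/" + fileId;
                FileLocation loc =
                        new FileLocation(fileId, owner, originalName, v.server, path);
                locations.put(fileId, loc);
                return loc;
            }
        }
        throw new IllegalStateException(
                "No registered volume has " + sizeBytes + " bytes free");
    }

    FileLocation lookup(String fileId) {
        return locations.get(fileId);
    }

    public static void main(String[] args) {
        StorageRegistry registry = new StorageRegistry();
        registry.register(new StorageVolume("server1", "/data1", 100));
        registry.register(new StorageVolume("server2", "/data1", 1000));
        FileLocation first = registry.store("josh", "FirstFile.doc", 80);
        FileLocation second = registry.store("josh", "SecondFile.doc", 80);
        System.out.println(first.server + ":" + first.path);
        System.out.println(second.server + ":" + second.path);
    }
}
```

In this run the second file lands on server2, because server1 no
longer has room after the first upload. Adding capacity then means
registering another volume; existing file records are not touched.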

Question:
=========
In summary, I'm looking for a solid answer regarding a scalable
architecture and software design that would accommodate a growing user
and document base.
Also, I would like the names and contact information for people/groups
that I may hire to help me implement these suggestions (e.g.
consulting groups specializing in building Java-based scalable
software).  Focus should be on true scalability and cost effectiveness
- starting out with one server, building into multiple.  I don't want
to spend thousands of dollars per month (I'm hosting with Rackspace)
starting out, but would be willing to if the client base required it.

Answer:
=======
You can find a solid answer (technical solution) only if you provide
details of the requirements and go through proper software development
phases. For this project I suggest you stick to the following software
development life cycle.

Development Approach:

In my opinion, every project takes its due time to become stable no
matter what process or technique you adopt, and poor project
management techniques often make a project fail and frustrate the
stakeholders. In my experience, following the rules of the Software
Development Life Cycle strictly makes life easier for each party and
raises the quality of the deliverables at each stage. I suggest you
follow the software life cycle below strictly.

Requirement Analysis:
Analysis experts should perform thorough research to develop your
business idea further and prepare detailed requirement specifications.
After reading this document you should know the exact details of the
application's features. You should approve the document only if it
satisfies your requirements. It should be updated in response to your
comments until you are satisfied.

Architecture Definition:
Based on the requirement analysis, an architecture should be adopted,
including details of the software and hardware used for the project,
among other details.

Prototyping:
A graphic designer should prepare a comprehensive prototype of the
requirements, and that prototype should also be sent to you for
functional and graphics approval. You should approve the prototype only
if it satisfies your requirements. It should be updated in response to
your comments until you are satisfied.

Design Phase:
Have the whole application designed before any code is written, and
have a third party perform multiple technical reviews of the design to
improve the quality of the application. Design is the most important
step, yet it is often skipped for small projects, which causes many
problems later.

Coding Phase:
Code your application and have multiple technical reviews performed to
improve its quality.

Testing Phase:
Perform thorough functional and technical testing of the application
to ensure the stability of the end product.

If I can help you further, I would be happy to do so.

Regards,
Tobascus
Subject: Re: Designing for Scalability: Java, MySQL, and Linux
From: tobascus-ga on 19 May 2004 04:54 PDT
 
Also visit question 341308 and read comments 
http://answers.google.com/answers/threadview?id=341308

Also visit question 320155 and read comments
http://answers.google.com/answers/threadview?id=320155

You will find more useful information there, which should help you out.
Subject: Re: Designing for Scalability: Java, MySQL, and Linux
From: tobascus-ga on 19 May 2004 04:57 PDT
 
In the comments under this question:
http://answers.google.com/answers/threadview?id=320155

there are some helpful links that may help solve your problems.
Subject: Re: Designing for Scalability: Java, MySQL, and Linux
From: memphisblues-ga on 19 May 2004 08:27 PDT
 
tobascus-ga:

First, thank you for taking time to write your thoughts.  I have a
good understanding of developing software (and have been doing it
professionally for years).  I'm not looking for development
methodology.  My question is a very specific one: what are the options
for scaling a website, and how would one design/program software to
account for it?  E.g. DAS, NAS, or SAN are all options for the storage
scalability I require - how would I need to design the upload/store
portion of my application to handle these options.

I don't think that hard-coding the server's IP address and checking
server storage capacity at runtime is a viable option.  What happens
if the IP changes?  What happens if that server goes down?  I
understand that this could work, but I'm really more interested in
hearing from people who have encountered the problem I'm discussing,
or in obtaining contact information of people who have.

Again, thank you for the overview of software methodology, but I'm
really looking for information on a system/network architecture that
would support the storage scalability that I mentioned, and how the
software would accommodate this architecture.
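
As a rough illustration of the concern about hard-coded addresses, one
option is to store only a logical volume name and a relative path with
each file record, and to resolve the actual location at runtime. The
sketch below assumes exactly that; VolumeResolver, the volume name
"vol1", and the mount points are hypothetical names chosen for the
example, not part of any existing library.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical resolver: file records hold (volumeId, relativePath),
 * and only this mapping knows where a volume is currently mounted.
 */
final class VolumeResolver {
    private final Map<String, Path> roots = new HashMap<>();

    /** Registers or re-points a logical volume to a mount point. */
    void mount(String volumeId, String root) {
        roots.put(volumeId, Paths.get(root).normalize());
    }

    /** Turns a stored (volumeId, relativePath) pair into a real path. */
    Path resolve(String volumeId, String relativePath) {
        Path root = roots.get(volumeId);
        if (root == null) {
            throw new IllegalArgumentException("Unknown volume: " + volumeId);
        }
        Path resolved = root.resolve(relativePath).normalize();
        // Reject paths such as "../x" that would leave the volume root.
        if (!resolved.startsWith(root)) {
            throw new IllegalArgumentException("Path escapes volume root: " + relativePath);
        }
        return resolved;
    }

    public static void main(String[] args) {
        VolumeResolver resolver = new VolumeResolver();
        resolver.mount("vol1", "/mnt/nas1");
        System.out.println(resolver.resolve("vol1", "john-doe/uniquename001.doc"));

        // The storage moved: only the mapping changes, not the file records.
        resolver.mount("vol1", "/mnt/san2");
        System.out.println(resolver.resolve("vol1", "john-doe/uniquename001.doc"));
    }
}
```

With this kind of indirection, a changed IP address or a move from one
storage box to another is a configuration change rather than an update
to every stored file record. It does not by itself address a server
going down.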
Subject: Re: Designing for Scalability: Java, MySQL, and Linux
From: tobascus-ga on 19 May 2004 10:57 PDT
 
Thanks for the clarifications; I now understand your question better
than before. I will give you a more satisfactory and precise answer in
a day or two, after discussing this problem with my fellow
IT/networking professionals.
Subject: Re: Designing for Scalability: Java, MySQL, and Linux
From: tobascus-ga on 19 May 2004 22:11 PDT
 
I can forward you contacts of professionals (e.g. consulting groups
specializing in building Java-based scalable software) who would be
able to help you implement this project, but first I would like to
know your expected budget and timeline for it.
Subject: Re: Designing for Scalability: Java, MySQL, and Linux
From: tobascus-ga on 19 May 2004 23:25 PDT
 
These links might also help you with the architectural issues of
such an application:

1)
Enterprise Volume Management System:
====================================
Enterprise Volume Management System (EVMS) is a management tool. It
gives you a choice of nice friendly interfaces and runs the
appropriate volume managers, software RAID, file system tools,
partition managers, etc. on the backend. It is extremely cool and
allows you to easily and SAFELY perform most tasks related to volume
management and filesystem maintenance. While messing with partitions,
filesystems, etc. can be very scary for the uninitiated, EVMS gives a
level of protection: it performs the appropriate backend functions in
the correct order and only allows actions that make sense.

EVMS Introduction: http://evms.sourceforge.net/

EVMS 2.0 Architecture Overview: http://evms.sourceforge.net/architecture/

EVMS Cluster Design Document version 2.0: http://evms.sourceforge.net/clustering/

2) Logical Volume Manager (LVM):
================================
http://www.sistina.com/products_lvm.htm

Logical Volume Manager (LVM) is a storage virtualization tool. It
basically gives you the ability to take a bunch of different storage
devices (different sizes, types, speeds) and glom them together into
large areas of virtual storage, and then divide it up arbitrarily to
fit your needs. LVM is part of the Linux kernel and has been for some
time. Even if you haven't used it, you have probably seen the option
to use it if you have installed Linux recently.
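
From the application's side, storage pooled this way usually appears
as a single mount point, so the upload code can stay the same as
capacity is added underneath. The sketch below is only an illustration
of that idea: it uses the standard java.nio.file API to ask how much
space is usable under a directory, and the class name MountSpace and
the default directory are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: reports usable space on whatever volume backs a directory. */
final class MountSpace {

    /** Usable bytes on the file store that holds the given directory. */
    static long usableBytes(Path dir) {
        try {
            FileStore store = Files.getFileStore(dir);
            return store.getUsableSpace();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if a file of the given size should fit under the directory. */
    static boolean hasRoomFor(Path dir, long sizeBytes) {
        return usableBytes(dir) >= sizeBytes;
    }

    public static void main(String[] args) {
        // Pass the document root as the first argument; "." is only a default.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(dir.toAbsolutePath() + " has "
                + usableBytes(dir) + " usable bytes");
    }
}
```

If the volume behind the directory is later extended, the same call
should simply report more space, with no change to the paths stored by
the application.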
