Q: MTBF for a system with several subsystems with individual MTBFs (Answered, 7 Comments)
 Question
 Subject: MTBF for a system with several subsystems with individual MTBFs Category: Computers Asked by: ravil-ga List Price: \$10.00 Posted: 19 Aug 2004 16:23 PDT Expires: 18 Sep 2004 16:23 PDT Question ID: 390140
 ```I need to find out the MTBF of an appliance that we manufacture. The appliance consists of several subsystems: motherboard, power supply, fans, LCD, etc. If I know the MTBF of each of these subsystems, how do I go about calculating the MTBF for the appliance? I need links to tutorials that describe this scenario in a clear and concise manner.```
 Subject: Re: MTBF for a system with several subsystems with individual MTBFs Answered By: maniac-ga on 21 Aug 2004 10:16 PDT Rated:
 ```Hello Ravil,

A complete answer will take into account both:
- serial failures
- parallel failures (or redundant components)
There are different formulas for these two basic scenarios, which can be combined to produce an accurate system-level MTBF result.

Serial Failures:

Let's do the serial failure case first. In this case, the failure of any one component will cause failure of the system. This may sometimes be illustrated as:
  A -> B -> C -> D
to show the serial nature of the failure analysis. The comment by jeannot52 is basically correct for this case (but does not address the case with redundant components). However, I find it much easier to describe (and compute) by introducing failures per million hours (Lambda).

For the remaining part of the answer:
  MTBF = mean time between failures (hours per failure)
  Lambda = failures per million hours
  F = failure rate or probability of failure in one hour
  R = reliability rate (probability of working in one hour)
You can convert between MTBF and Lambda with the following equations:
  Lambda = 1,000,000 / MTBF
  MTBF = 1,000,000 / Lambda
and, assuming a constant failure rate (not necessarily true),
  F = Lambda / 1,000,000 or 1 / MTBF
  R = 1 - F
The reason I talk in terms of Lambda (failures per million hours) is that you can still explain the comparisons (fewer failures per million hours is better) and the arithmetic is simpler (until you *have* to compute MTBF).

So if you have four components, A, B, C, and D, with MTBFs of 20,000, 10,000, 15,000, and 30,000 hours respectively, the MTBF of the system is calculated as:
  Lambda A = 1,000,000 / 20,000 = 50.0
  Lambda B = 1,000,000 / 10,000 = 100.0
  Lambda C = 1,000,000 / 15,000 = 66.67
  Lambda D = 1,000,000 / 30,000 = 33.33
  Lambda (composite system) = 50 + 100 + 66.67 + 33.33 = 250
  MTBF (composite system) = 1,000,000 / 250 = 4,000 hours
This is a relatively simple example. A system with tens or hundreds of components can be calculated in the same way.
Redundant Components:

If you have two components in parallel (e.g., dual power supplies) where a failure of both components is required to fail the system, the MTBF of the system is MUCH greater than that of either component. I will do a simple example using both serial and parallel failures.

Assume A and B both have an MTBF of 100 hours, or Lambda = 10,000. The failure rate F for A and B would then be 0.01 for each. For comparison, the serial solution has Lambda = 20,000 failures per million hours, or MTBF = 50 hours.

For the redundant case, the probability (F) that both items are failed at the same time is:
  F = FA * FB
  F = 0.01 * 0.01
  F = 0.0001
Solving for Lambda gives Lambda = 100, or MTBF = 10,000 hours.

So there is a substantial improvement in reliability when using redundant components. Note that if you have serial components before / after the redundant components, you still need to handle those in series with the redundant components.

A composite system:

If you have both serial and parallel components, break up the system into pieces and do the Lambda calculations for each piece as serial or parallel. I usually end up with several serial items to add at the end and then compute the overall system MTBF value.

A few other sources or books include:

http://www.softwareresearch.net/site/teaching/SS2003/PDFdocs.EmbC/16_fault_tolerance.pdf
Part of a training class. Has the formulas and illustrations about halfway down the file. Also includes some historical information and an explanation of the different causes of failure (e.g., mechanical) and why failure rates can vary over time. Note this uses R for most of the calculations.

http://www.alericonetworks.com/support/analysis/harddisk/
An explanation of hard disk reliability, how different ways of organizing the disks can affect overall reliability, and how reliability varies with time.

http://www.intellectuk.org/publications/relc.asp
A reliability guide from a UK industry group - price listed at 45 pounds.
See the table of contents / introduction link for a good summary of what is covered.

http://www.fetchbook.info/search_1587130173/tab_reviews.html
A book describing networks - but includes sections on computing reliability with the various methods. Also addresses several issues related to high availability. You can apparently download the "SHARC Spreadsheet" from a link at the end of http://safariexamples.informit.com/1587130173/index.htm for more general calculations.

http://www.enre.umd.edu/rmp.htm
Links to several reliability analysis programs with brief descriptions. See also the links near the bottom for other resources on this university site.

http://www.bmtrcl.com/reliability_calculator.html
A simple reliability calculator - converting to / from failures per million hours (or failures per hour) to MTBF in hours or years.

Search phrases included:
  lambda failure per million hours
  RMA reliability formula
  "reliability calculator"
  mtbf reliability tutorial

If you have a lot of these reliability calculations to perform, I strongly suggest getting a tool. Perhaps start with the SHARC spreadsheet and then move on to a more comprehensive tool later.

Please use a clarification request if you need further details on performing the reliability calculations or if some part of the answer is unclear.

--Maniac```
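The serial and redundant calculations in the answer can be sketched in a few lines of Python. This is only a minimal sketch that follows the answer's own simplified method (constant failure rates, with F = FA * FB treated as the per-hour failure probability of a redundant pair). The function names and the composite example at the end are illustrative assumptions and do not come from the original answer.

```python
MILLION = 1_000_000  # Lambda is expressed in failures per million hours


def mtbf_to_lambda(mtbf_hours):
    """Convert an MTBF in hours to failures per million hours."""
    return MILLION / mtbf_hours


def lambda_to_mtbf(lam):
    """Convert failures per million hours back to an MTBF in hours."""
    return MILLION / lam


def series_lambda(mtbfs):
    """Serial case: any single failure stops the system, so Lambdas add."""
    return sum(mtbf_to_lambda(m) for m in mtbfs)


def parallel_lambda(mtbfs):
    """Redundant case as in the answer: multiply the per-hour failure
    probabilities F = 1 / MTBF, then scale back to failures per million."""
    f = 1.0
    for m in mtbfs:
        f *= 1.0 / m
    return f * MILLION


# Serial example from the answer: 20,000 / 10,000 / 15,000 / 30,000 hours.
serial_mtbf = lambda_to_mtbf(series_lambda([20_000, 10_000, 15_000, 30_000]))

# Redundant example from the answer: two units of 100 hours each.
redundant_mtbf = lambda_to_mtbf(parallel_lambda([100, 100]))

# Hypothetical composite: the redundant pair in series with a 20,000-hour part.
composite_lambda = parallel_lambda([100, 100]) + mtbf_to_lambda(20_000)
composite_mtbf = lambda_to_mtbf(composite_lambda)

print(round(serial_mtbf), round(redundant_mtbf), round(composite_mtbf))
```

The first two printed values reproduce the 4,000 and 10,000 hours worked out by hand in the answer. The composite case shows the "add the serial items at the end" step: the redundant pair contributes its own Lambda, which is then summed with the other serial parts.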
 ravil-ga rated this answer:

 Subject: Re: MTBF for a system with several subsystems with individual MTBFs From: dhrm77-ga on 19 Aug 2004 20:34 PDT
 ```Assuming that the design is well within all the recommended operating conditions for each component, the over-all MTBF is the lowest MTBF of all components.```
 Subject: Re: MTBF for a system with several subsystems with individual MTBFs From: msblack-ga on 20 Aug 2004 00:10 PDT
 ```That doesn't sound correct. Probabilities multiply. For example, if a part has a 10% failure rate within one year, two identical parts have a failure rate (either one or both failing) of 19%.```
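The 19% figure can be checked with a short calculation. This sketch assumes the parts fail independently of each other; the function name `prob_any_failure` is made up for illustration.

```python
def prob_any_failure(p_single, n_parts):
    """Probability that at least one of n independent, identical parts fails.

    All parts survive with probability (1 - p) ** n, so the complement is
    the chance that one or more of them fail.
    """
    return 1.0 - (1.0 - p_single) ** n_parts


two_parts = prob_any_failure(0.10, 2)
print(f"{two_parts:.2f}")
```

With a 10% failure rate per part, both parts survive 0.9 x 0.9 = 81% of the time, which leaves 19% for one or both failing.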
 Subject: Re: MTBF for a system with several subsystems with individual MTBFs From: msblack-ga on 20 Aug 2004 07:36 PDT
 ```Taken another way: MTBF is Mean Time Between Failures. That translates to the part having a 50% chance of surviving that length of time. If A has an MTBF of 1 year and B has an MTBF of 1 year, either part has a 50% chance of working after one year. Taken together, there's only a 25% chance that both parts will be working after one year.```
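The 50% figure is worth checking against a specific failure model. As a rough sketch, assuming the constant failure rate that the answer uses (so survival follows R(t) = exp(-t / MTBF)), the chance of surviving to the MTBF comes out nearer 37%, and the 50% point is the median life rather than the mean. The function names below are illustrative only.

```python
import math


def survival_probability(t, mtbf):
    """R(t) = exp(-t / MTBF), valid only for an assumed constant failure rate."""
    return math.exp(-t / mtbf)


def median_life(mtbf):
    """Time by which half the units have failed: MTBF * ln(2)."""
    return mtbf * math.log(2)


one = survival_probability(1.0, 1.0)  # one part, one year, MTBF of one year
both = one * one                      # two independent parts both still working
print(f"{one:.3f} {both:.3f} {median_life(1.0):.3f}")
```

The multiplication step in the comment still holds: the probability that both parts survive is the product of the individual survival probabilities. Only the starting value changes under this model.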
 Subject: Re: MTBF for a system with several subsystems with individual MTBFs From: ddh-ga on 20 Aug 2004 12:38 PDT
 ```Assume you have 3 components A, B, C, and that the failure rates of A, B, C within a period (let's say 1 year) are X%, Y%, Z% respectively.
So, the chance that component A does not fail during the year is (100-X)%.
Similarly, the chance that component B does not fail during the year is (100-Y)%.
Also, the chance that component C does not fail during the year is (100-Z)%.
Therefore, the chance that none of the 3 components fails is
(100-X)/100 x (100-Y)/100 x (100-Z)/100 = ((100-X)/100 x (100-Y)/100 x (100-Z)/100) x 100%
and the chance that at least 1 component fails within the period is
(100 - ((100-X)/100 x (100-Y)/100 x (100-Z)/100) x 100)%
You know the percentage that will fail during the fixed period (1 year), so you can use this to work out MTBF. (I don't know how to work out the MTBF from the failure rate.) This can also be used for 4, 5, 6 and many more components.```
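One possible way to fill in the missing last step, from a failure percentage over a period to an MTBF, is to assume a constant failure rate, so that the surviving fraction after time t is exp(-t / MTBF). Under that assumption MTBF = -period / ln(1 - failure fraction). The sketch below uses made-up yearly failure rates of 5%, 10% and 2%; both function names are hypothetical.

```python
import math


def system_failure_percent(percents):
    """Chance (in %) that at least one component fails within the period."""
    survive = 1.0
    for p in percents:
        survive *= (100.0 - p) / 100.0
    return 100.0 * (1.0 - survive)


def mtbf_from_failure_percent(percent, period):
    """MTBF in the same unit as `period`, assuming R(t) = exp(-t / MTBF)."""
    return -period / math.log(1.0 - percent / 100.0)


# Hypothetical yearly failure rates for components A, B and C.
rates = [5.0, 10.0, 2.0]
system_pct = system_failure_percent(rates)
system_mtbf_years = mtbf_from_failure_percent(system_pct, period=1.0)
print(f"{system_pct:.2f}% -> {system_mtbf_years:.2f} years")
```

Under the same assumption, converting each component's percentage to an MTBF first and then adding the reciprocals gives the same system MTBF, so this approach and the reciprocal-sum rule agree.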
 Subject: Re: MTBF for a system with several subsystems with individual MTBFs From: ravil-ga on 20 Aug 2004 14:25 PDT
 ```All I need is to provide this kind of information for our appliance: MTBF at 40 degrees C is xxx,xxx hours. How have others got this information, knowing the MTBFs of the subsystems?```
 Subject: Re: MTBF for a system with several subsystems with individual MTBFs From: jeannot52-ga on 20 Aug 2004 21:39 PDT
 ```The general answer to your question is that the reciprocal of the MTBF of an entire system is equal to the sum of the reciprocals of the MTBFs of all its subsystems, i.e.: 1/MTBF(total) = 1/MTBF(subcomponent1) + 1/MTBF(subcomponent2) + etc. You can thus easily derive the system MTBF by adding a few fractions and taking the inverse of the resulting fraction. The trick is to make sure that each of the subsystems' MTBFs is known (in particular, if only an estimate is provided, you must check that the assumed values use the same hypotheses, i.e. working conditions such as temperature, pressure, duty cycle, and so on). Best of luck and best regards. jeannot52```
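A short sketch of why the reciprocals add, assuming each subsystem fails independently and at a constant rate (the symbols R_i and n are introduced here for illustration and are not part of the comment):

```latex
R_i(t) = e^{-t/\mathrm{MTBF}_i}, \qquad
R_{\text{total}}(t) = \prod_{i=1}^{n} R_i(t)
  = \exp\!\Big(-t \sum_{i=1}^{n} \frac{1}{\mathrm{MTBF}_i}\Big)
\;\Longrightarrow\;
\frac{1}{\mathrm{MTBF}_{\text{total}}} = \sum_{i=1}^{n} \frac{1}{\mathrm{MTBF}_i}

% Check against the four-component example in the answer:
\frac{1}{20000} + \frac{1}{10000} + \frac{1}{15000} + \frac{1}{30000}
  = \frac{3 + 6 + 4 + 2}{60000} = \frac{1}{4000}
```

The system works only while every subsystem works, so the survival probabilities multiply, the exponents add, and the result is again an exponential whose MTBF is the inverse of the summed reciprocals.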
 Subject: Re: MTBF for a system with several subsystems with individual MTBFs From: dhrm77-ga on 24 Aug 2004 15:43 PDT
 ```Oops... I was just thinking about it again, and yes, I made a mistake. I was thinking in terms of the weakest link and didn't realize that we were talking about averages. Yes, probabilities do multiply.

The answer isn't really simple, however. Let's say we have 2 systems, and both have an MTBF of 5 years. That means half of the units of each system will have failed after 5 years, so only 1/4 of the combined systems are still working. The problem is: what exactly does the failure curve look like? Is it a straight line? Is it a curve where 1/2 of the remaining systems break down after another 5 years? If the failure rate follows an arithmetic curve, then the overall MTBF would be 0.707 x 5 years, or about 3.5 years.

It could be designed in such a way that 99% of the systems will work correctly for 4 years, and then within the next 2 or 3 years 80% of those systems will experience some failure. A typical example is a car. They work fine for the first few years, then you keep having to replace some parts. That makes the overall MTBF very hard to calculate.

Furthermore, as "maniac" mentioned above, there are scenarios with serial failures. I had one of those recently, where a hole drilled in the wrong place went through a power supply without seemingly affecting it. The cracked PC board allowed moisture to start corroding copper traces. When the trace was gone, the power supply didn't regulate anymore, which activated the protection circuitry, which started to draw power from a nearby power supply, overburdening and burning up the over-current protection, and starting a fire. But it really doesn't matter: once a component has failed, whether or not some other components fail too in the process is sometimes irrelevant, except if "backup" sub-systems will compensate for the failing component.

As I read above, any attempt to calculate an MTBF assumes that the known failure rate of each component is linear or arithmetic with time, which is not necessarily the case.```
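The dependence on the shape of the failure curve can be illustrated with a Weibull lifetime model. This is only a sketch under stated assumptions: a Weibull shape of 1 stands for a constant failure rate, and a shape of 2 is one possible reading of an "arithmetic" curve (a failure rate that rises linearly with time). The function name `series_mtbf` and the choice of Weibull are illustrative and do not come from the thread.

```python
import math


def series_mtbf(unit_mtbf, n_units, shape):
    """MTBF of n identical, independent Weibull units in series.

    The shortest of n identical Weibull lifetimes is again Weibull with the
    same shape and a scale reduced by n ** (-1 / shape).
    """
    mean_factor = math.gamma(1.0 + 1.0 / shape)       # mean = scale * Gamma(1 + 1/shape)
    scale = unit_mtbf / mean_factor                   # scale of a single unit
    series_scale = scale * n_units ** (-1.0 / shape)  # scale of the first failure
    return series_scale * mean_factor


for shape in (1.0, 2.0):
    combined = series_mtbf(5.0, 2, shape)
    print(f"shape={shape}: {combined:.2f} years ({combined / 5.0:.3f} x 5 years)")
```

With two 5-year units, the constant-rate case gives half the single-unit MTBF, which matches the reciprocal-sum rule, while the linearly rising case gives the 0.707 factor quoted in the comment. The same component MTBFs therefore lead to different system MTBFs depending on the assumed curve.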