
Six Sigma Statistics with Excel and Minitab

Sunday, July 7, 2019



Author: JERE JAYROE
Language: English, Spanish, Arabic
Country: Iraq
Genre: Religion
Pages: 480
Published (Last): 03.12.2015
ISBN: 418-3-48078-316-1

Six Sigma Statistics with Excel and Minitab, by Issa Bass.

It helps detect assignable causes of variation and facilitates corrective actions. In other words, it helps assess the sources of variation that can be linked to the independent variables and determine how those variables interact and affect the predicted variable. The dialog box shown in Figure 2.

Open a new Minitab worksheet. Store the square root of C1 in C4. Sum the three columns and store the results in C5. To add up the columns, the user has two options.

One option is to complete the dialog box as indicated in Figure 2. Find the medians for each row and store the results in C6. Remember that we are not looking for the median within an individual column but rather across columns.
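For readers without Minitab at hand, the column operations above (square root of C1, row-wise sum, row-wise median) can be sketched in Python. The column values are hypothetical, and the names `c1` to `c6` only mirror the worksheet columns; this is an illustration, not the Minitab procedure itself.

```python
import math
import statistics

# Hypothetical worksheet columns C1..C3; the values are illustrative only.
c1 = [4.0, 9.0, 16.0, 25.0]
c2 = [5.0, 7.0, 20.0, 21.0]
c3 = [1.0, 2.0, 3.0, 4.0]

# C4: square root of each value in C1.
c4 = [math.sqrt(v) for v in c1]

# C5: sum across the three columns, row by row.
c5 = [a + b + c for a, b, c in zip(c1, c2, c3)]

# C6: median across columns for each row (not within a column).
c6 = [statistics.median(row) for row in zip(c1, c2, c3)]

print(c4, c5, c6)
```

Using `zip` keeps each calculation row-wise, which is the point the text stresses for the median.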

Find all the cells in column C1 that have greater values than the ones in column C2 on the same rows. The zeros represent the cells in C1 whose values are lower than the corresponding cells in C2.

Help menu. Minitab has spared no effort to build solid resources under the Help menu. It contains very rich tutorials and is easy to access. The Help menu can be accessed from the menu bar but it can also be accessed from any dialog box.

The Help menu contains examples and solutions with the interpretations of the results. Example The user is running a regression analysis and would like to know how to interpret the results. To understand how the test is conducted and how to interpret the results, all that must be done is to open a Regression dialog box and select the Help button.

The tutorials contain not only an overview of the topic but also practical examples on how to solve problems and how to interpret the results obtained.

The StatGuide appears under two windows: MiniGuide and StatGuide. The MiniGuide contains the links to the different topics that are found on the StatGuide.

Like any other spreadsheet software, Excel is used to enter, organize, store, and analyze data and display the results in a comprehensive way.

Most of the basic probabilities and descriptive statistical analyses can be done using built-in tools that are preloaded in Excel. In this book, we will only use the capabilities built into the basic Excel package. This box contains more than just statistical tools.

The forms that this dialog box takes depend on the category selected, but the different parts can be reduced to five areas. The dialog box shown in Figure 2. is annotated as follows: details the expected results; defines what should be entered in the active field; links to the help menu for the selected category; displays the results of the analysis.

These are done through Data Analysis, which is an add-in that can be easily installed from the Tools menu. These options will be examined more extensively throughout this book. It is about how to convert raw numerical data into informative and actionable data.

As such, it applies to all spheres of management. The science of statistics is in general divided into two areas: descriptive statistics and inferential statistics. Population in statistics is just a group of interest. That group can be composed of people or objects of any kind. This chapter is about descriptive statistics.

It shows the basic tools needed to collect, analyze, and interpret data. That value can give a glimpse of the magnitude or the location of a measurement of interest. For instance, when we say that the average diameter of bolts that are produced by a given machine is 10 inches, even though we know that all the bolts may not have the exact same diameter, we expect their dimension to be close to 10 inches if the machine that generated them is well calibrated and the sizes of the bolts are normally distributed.

The single value used to describe data is referred to as a measure of central tendency or measure of location. The most common measures of central tendency used to describe data are the arithmetic mean, the mode, and the median.

Arithmetic mean for raw data. For ungrouped data, that is, data that has not been grouped in intervals, the arithmetic mean of a population is the sum of all the values in that population divided by the number of values in the population: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$. What is the average daily production? In this case, we elected to store it in C3.

The result will display as shown in Figure 3. [Minitab output: Descriptive Statistics: Production]

Each team has a different number of workers.

What is the average production per worker during that period?

Sometimes the available data are grouped in intervals or classes and presented in the form of a frequency distribution. The data on income or age of a population are often presented in this way. It is impossible to exactly determine a measure of central tendency, so an approximation is done using the midpoints $M_i$ of the intervals and the frequencies $f_i$ of the distribution: $\mu \approx \frac{\sum f_i M_i}{\sum f_i}$. Example The net revenues for a group of companies are organized as shown in Table 3.

Determine the estimated arithmetic mean revenue of the companies.

The geometric mean is the nth root of the product of n values: $GM = \sqrt[n]{x_1 x_2 \cdots x_n}$.

The mode represents the value of the observation that appears most frequently. Consider the following sample measurement:

To find the median, consider the following set of data: If we rearrange the data in order of increasing magnitude, we obtain:

The measures of dispersion or variability provide that information. If the values of the measures of dispersion show that the data are closely clustered around the mean, the mean would be a good representation of the data and a good and reliable average.
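The measures of central tendency described above can be sketched in Python as follows. The class intervals, frequencies, and raw values are hypothetical stand-ins, since the original tables are not reproduced here; `grouped_mean` and `geometric_mean` are illustrative helper names.

```python
import math
import statistics

def grouped_mean(intervals, frequencies):
    """Approximate mean of grouped data from class midpoints and frequencies."""
    midpoints = [(low + high) / 2 for low, high in intervals]
    weighted_sum = sum(m * f for m, f in zip(midpoints, frequencies))
    return weighted_sum / sum(frequencies)

def geometric_mean(values):
    """nth root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

# Hypothetical grouped data: three classes and their frequencies.
classes = [(0, 10), (10, 20), (20, 30)]
counts = [2, 5, 3]
approx_mean = grouped_mean(classes, counts)

# Hypothetical raw data for the other measures.
sample = [3, 7, 7, 9, 12]
gm = geometric_mean([2, 8])
mode = statistics.mode(sample)      # most frequent value
median = statistics.median(sample)  # middle value after sorting

print(approx_mean, gm, mode, median)
```

The grouped mean is only an approximation because each observation is replaced by the midpoint of its class.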

Variation is very important in quality control because it determines the level of conformance of the production process to the set standards. For instance, if we are manufacturing tires, an excessive variation in the depth of the treads of the tires would imply a high rate of defective products. The study of variability also helps compare the spread in more than one distribution. Suppose that the arithmetic mean of a daily production of cars in two manufacturing plants is We can conclude that the two plants produce the same number of cars every day.

But an observation over a certain period of time might show that one produces between and cars a day and the other between and The most widely used measures of dispersion are the range, the variance, and the standard deviation. The range is the difference between the highest and the lowest values of a data set.

It is not informative about the other values. If the highest and the lowest values in a distribution are both outliers i. The mean deviation, the variance, and the standard deviation provide more information about all the data observed.

Single deviations from the mean for a given distribution measure the difference between every observation and the mean of the distribution. In Table 3. Consider the example in Table 3. Because the sum of the deviations is always equal to zero, it cannot be used to measure the mean deviation; another method should be used instead. The mean deviation is the sum of the absolute values of the deviations from the mean divided by the number of observations in the population.

Example Use Table 3. In other words, on average The variance is the average of the squared deviation from the arithmetic mean. For that reason, we will consider the variance as a transitory step in the process of obtaining the standard deviation.

Using Excel, we must distinguish between the variance based on a sample (ignoring logical values and text in the sample), the variance based on a sample (including logical values and text), and the variance based on a population. We will use the latter. The standard deviation is the square root of the variance. For a sample, the variance is noted as $s^2$ and the standard deviation as $s$. Note that the smaller the standard deviation, the more closely the data are clustered around the mean. If the standard deviation is zero, all the data observed are equal to the mean.
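As a rough cross-check of the dispersion measures discussed so far, here is a minimal Python sketch using the standard `statistics` module. The data set is hypothetical. The population figures use `pvariance` and `pstdev`, and the sample variance uses `variance`, which divides by n − 1.

```python
import statistics

# Hypothetical observations; not taken from the book's tables.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.fmean(data)

# Range: highest value minus lowest value.
data_range = max(data) - min(data)

# Mean deviation: average absolute deviation from the mean.
mean_deviation = sum(abs(x - mean) for x in data) / len(data)

# Population variance and standard deviation (divide by N).
pop_variance = statistics.pvariance(data)
pop_std = statistics.pstdev(data)

# Sample variance (divide by n - 1), usually noted s^2.
sample_variance = statistics.variance(data)

print(data_range, mean_deviation, pop_variance, pop_std, sample_variance)
```

The sample variance is always slightly larger than the population variance computed on the same numbers, because of the n − 1 divisor.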

Given the number k greater than or equal to 1 and a set of n measurements $a_1, a_2, \ldots, a_n$, at least $1 - 1/k^2$ of the measurements lie within k standard deviations of their mean. Example A sample of bolts taken out of a production line has a mean of 2 inches in diameter and a standard deviation of 1.
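The bound in this theorem (commonly known as Chebyshev's theorem) is easy to compute and to compare against actual data. The sketch below assumes a hypothetical set of measurements; `chebyshev_lower_bound` and `fraction_within` are illustrative names.

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum fraction of measurements within k standard deviations of the mean."""
    if k < 1:
        raise ValueError("k must be greater than or equal to 1")
    return 1 - 1 / k**2

def fraction_within(values, k):
    """Observed fraction of values within k population standard deviations."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    inside = [v for v in values if abs(v - mean) <= k * std]
    return len(inside) / len(values)

# Hypothetical measurements.
measurements = [2, 4, 4, 4, 5, 5, 7, 9]
bound = chebyshev_lower_bound(2)            # guaranteed minimum for k = 2
observed = fraction_within(measurements, 2)  # what this data set actually shows

print(bound, observed)
```

The observed fraction can never fall below the bound, whatever the shape of the distribution.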

We cannot compare the standard deviation of the production of bolts to one of the availability of parts. If the standard deviation of the production of bolts is 5 and that of the availability of parts is 7 for a given time frame, we cannot conclude that the standard deviation of the availability of parts is greater than that of the production of bolts, and therefore the variability is greater with the parts.
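One common way to compare dispersions measured on different scales is the coefficient of variation, the standard deviation expressed as a percentage of the mean. In the sketch below, the standard deviations 5 and 7 come from the text, while the two means are hypothetical values chosen only for illustration.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev * 100 / mean

# Standard deviations from the text; the means are hypothetical.
cv_bolts = coefficient_of_variation(5, 50)
cv_parts = coefficient_of_variation(7, 140)

# Despite the larger standard deviation, the parts vary less relative to their mean.
print(cv_bolts, cv_parts)
```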

Example A sample of students was taken to compare their income and expenditure on books. The standard deviations and means are summarized in Table 3. How do the relative dispersions for income and expenditure on books compare? These statistics can help estimate the existence of a relationship between variables and the strength of that relationship. Example Based on the data in Table 3. As x increases, so does y, and when x is greater than its mean, so is y. The covariance is limited in describing the relatedness of x and y.

It can show the direction in which y moves when x changes but it does not show the magnitude of the relationship between x and y.

If we say that the covariance is 2. A better measure of association based on the covariance is used by statisticians. The sign of r will be the same as the sign of the covariance. In other words, an increase in x will lead to a proportional decrease in y. When r equals zero, there is no relation between the variation in x and the variation in y.

Example Given the data in Table 3. Using Excel. When $r^2$ equals one, the changes in y are explained fully by the changes in x. Any other value of $r^2$ must be interpreted according to how close it is to zero or one.
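The relationship between the covariance, the coefficient of correlation r, and $r^2$ can be sketched as follows. The x and y values are hypothetical, and the covariance here uses the sample (n − 1) divisor, which is an assumption; the value of r does not depend on that choice.

```python
import math

def covariance(xs, ys):
    """Sample covariance (n - 1 divisor) of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Coefficient of correlation r: covariance scaled by both standard deviations."""
    std_x = math.sqrt(covariance(xs, xs))
    std_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = covariance(x, y)
r = correlation(x, y)
r_squared = r**2  # share of the variation in y explained by x

print(cov_xy, r, r_squared)
```

The sign of `r` matches the sign of `cov_xy`, while its magnitude is bounded between −1 and 1, which is what makes it easier to interpret than the covariance.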

For the previous example, r was equal to 0. Histograms, stem-and-leaf, and box plots are types of graphs commonly used in statistics. A histogram enables the experimenter to visualize how the data are spread, to see how skewed they are, and detect the presence of outliers. The construction of a histogram starts with the division of a frequency distribution into equal classes, and then each class is represented by a vertical bar.

Using Minitab, we can construct the histogram for the data in Table 3. The two sets of data seem to be highly correlated. The stem-and-leaf graph is essentially composed of two parts, the stem and the leaf. Consider the data in Table 3.

The numbers that start with 1 have 7, 8, 8 again, and 9 as the second digits. There are three numbers starting with 4 and all of them have 0 as a second digit. Excel does not provide a function for stem-and-leaf graphs without using macros. Stem-and-Leaf Display: The plot shows how the data are scattered within those ranges.
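Since Excel has no built-in stem-and-leaf function, a few lines of Python can produce one. The values 17, 18, 18, 19 and the three 40s follow the description above; the remaining values are hypothetical fillers, and `stem_and_leaf` is an illustrative helper for two-digit integers.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group two-digit integers by tens digit (stem) and units digit (leaf)."""
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    return dict(sorted(groups.items()))

# 17, 18, 18, 19, 40, 40, 40 match the text; 23, 25, 31 are hypothetical.
observations = [18, 40, 17, 23, 19, 40, 31, 18, 25, 40]
display = stem_and_leaf(observations)

for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```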

The advantage of using the box plot is that when it is used for multiple variables, not only does it graphically show the variation between the variables but it also shows the variations within the ranges. In this case, we have an uneven distribution: The upper quartile will be the median of the observations on the right of 93—therefore, The last step will consist in drawing a rectangle, the corners of which will be the quartiles. The interquartile range will be the difference between the upper and lower quartiles, e.

The IQR measures the vertical distance of the box; it is the difference between the upper quartile and the lower quartile values. In this case, it will be the difference between and 73, and therefore equal to An outlier is considered extreme when it is away from the closest quartile by more than 3 IQR. We do not have any outliers in this distribution but the graph shows that the data are unevenly distributed with more observations concentrated below the median.
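The box plot quantities described above can be sketched in Python. Quartiles are taken as the medians of the observations on either side of the overall median, as in the text. The 3 IQR rule for extreme outliers is from the text; the 1.5 IQR fence for mild outliers is a common convention assumed here. The data are hypothetical.

```python
import statistics

def box_plot_summary(values):
    """Median, quartiles (median-of-halves), IQR, and outliers beyond 1.5 and 3 IQR."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    # Skip the middle observation when n is odd.
    upper_half = data[half + 1:] if n % 2 else data[half:]
    q1 = statistics.median(lower_half)
    q3 = statistics.median(upper_half)
    iqr = q3 - q1
    mild = [v for v in data if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]
    extreme = [v for v in data if v < q1 - 3 * iqr or v > q3 + 3 * iqr]
    return {"q1": q1, "median": statistics.median(data), "q3": q3,
            "iqr": iqr, "outliers": mild, "extreme_outliers": extreme}

# Hypothetical observations with one unusually large value.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(summary)
```

Other quartile conventions exist (Minitab and Excel interpolate between observations), so the quartile values may differ slightly from software output.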

Using Minitab, open the worksheet Box Whiskers. Clicking on any line on the graph would give you the measurements of the lines of interest.


Figure 3. Not only do box plots show how data are spread within a distribution, they also help compare variability within and between distributions and even visualize the existence of correlations between distributions.

Example The data in Table 3. We want to determine if there is a difference between the two rooms and the level of temperature variability within the two rooms. Open heat. The graphs show that there is a large disparity between the two groups, and for the room with metal the heat level is predominantly below the median. For the room without metal, the temperatures are more evenly distributed, albeit most of the observations are below the median.

The two columns in Table 3. The result should look like that shown in Figure 3. In other words, the number of CXS parts available in the warehouse must always be 50 percent of the inventory at any given time. But the management has noticed a great deal of increase in back orders, and sometimes CXS parts are overstocked. Based on the samples taken in Table 3. Show the means and the standard deviations for the two frequencies.

Determine if there is a perfect correlation between the inventory and part CXS. Determine the portion of variation in the inventory that is explained by the changes in the volume of part CXS. Using Minitab, draw box plots for the two frequencies on separate graphs. Determine the stem-and-leaf graphs for the two frequencies.

Using Minitab and then Excel, show the descriptive statistics summaries. No matter how well-calibrated a machine is, it is impossible to predict with absolute certainty how much part-to-part variation it will generate. Based on statistical analysis, an estimation can be made to have an approximate idea about the results. The area of statistics that deals with uncertainty is called probability. We all deal with the concept of probability on a daily basis, sometimes without even realizing it.

What are the chances that 10 percent of your workforce will come to work late? What is the likelihood that the shipment sent to the customers yesterday will reach them on time? What are the chances that the circuit boards received from the suppliers are defect free?

So what is probability? It is the chance, or the likelihood, that something will happen. It is a number between zero and one. If there is a 100 percent chance that the event will take place, the probability will be one, and if it is impossible for it to happen, the probability will be zero. An experiment is the process by which one observation is obtained. An example of an experiment would be the sorting out of defective parts from a production line. An event is the outcome of an experiment.

Determining the number of employees who come to work late twice a month is an experiment, and there are many possible events; the possible outcomes can be anywhere between zero and the number of employees in the company. A sample space is the set of all possible outcomes in an experiment.

A random variable is said to be discrete when all the possible outcomes are countable. The four most used discrete probability distributions in business operations are the binomial, the Poisson, the geometric, and the hypergeometric distributions.

For the remainder of this section, p will be considered as the probability for a success and q as the probability for a failure. The binomial probability of x successes in n trials is $P(x) = {}_nC_x\, p^x q^{n-x}$. The variable x may take any value from zero to n and ${}_nC_x$ represents the number of possible outcomes that can be obtained. What is the probability of having only 2 bottles that pass audit in a randomly selected sample of 7 bottles?
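The binomial calculation for the bottle example can be sketched as below. The sample size (7) and the number of successes (2) come from the text; the pass probability is not available here, so `p_pass = 0.5` is a hypothetical placeholder.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of a failure
    return math.comb(n, x) * p**x * q**(n - x)

p_pass = 0.5  # hypothetical probability that a bottle passes audit
prob_two_of_seven = binomial_pmf(2, 7, p_pass)

# The probabilities over all possible outcomes 0..n add up to one.
total = sum(binomial_pmf(x, 7, p_pass) for x in range(8))

print(prob_two_of_seven, total)
```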

This result can also be found using the binomial table in the Appendix section. Using Minitab. Minitab has the capabilities to calculate the probabilities for more than just one event to take place.

Fill in the selected column C1 as shown in Figure 4. We are looking for the probability of an event to take place, not for the cumulative probabilities or their inverse. The number of trials is 7. The Input column is the one that contains the data that we are looking for and the Output column is the column where we want to store the results; in this case, it is C3.

Figure 4. A machine produces ceramic pots, and What is the probability of selecting 3 pots that weigh 5 pounds in a randomly selected sample of 8 pots?

The following example deals with an event that occurs over a continuum, and therefore can be solved using the Poisson distribution. Example A product failure has historically averaged 3.

What is the probability of 5 failures in a randomly selected day? The same result can be found in the Poisson table in Appendix 2. The output is shown in Figure 4. The same process can be followed in Excel. The Excel output is shown in Figure 4.

A machine has averaged a 97 percent pass rate per day. What is the probability of having more than 7 defective products in one day? But because it is Figure 4.

Defects per unit (DPU) and yield. Consider a company that manufactures circuit boards.
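The Poisson questions above (exactly 5 occurrences, and more than 7 occurrences) can be sketched in Python. The historical average is truncated in the text, so `daily_mean = 3.0` is a hypothetical value used only to make the example runnable.

```python
import math

def poisson_pmf(x, mean_rate):
    """Probability of exactly x occurrences when the average is mean_rate."""
    return math.exp(-mean_rate) * mean_rate**x / math.factorial(x)

daily_mean = 3.0  # hypothetical average number of occurrences per day

# Probability of exactly 5 occurrences in a day.
prob_five = poisson_pmf(5, daily_mean)

# Probability of more than 7: one minus the cumulative probability of 0..7.
prob_more_than_seven = 1 - sum(poisson_pmf(k, daily_mean) for k in range(8))

print(prob_five, prob_more_than_seven)
```

The "more than" case uses the complement of the cumulative probability, which is what the cumulative option in Minitab or Excel would return.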

A circuit board is composed of multiple elements such as switches, resistors, capacitors, computer chips, and so on. To measure the quality of his throughput, the manufacturer will want to know how many defects are found per unit. Because there are multiple parts per unit, it is conceivable to have more than one defect on one unit.

What is the probability of having one defect? The objective of a manufacturer is to produce defect-free products. A yield measures the probability of a unit passing a step defect-free, and the rolled throughput yield (RTY) measures the probability of a unit passing a set of processes defect-free. What is the DPU?
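A short sketch of how DPU, yield, and RTY relate, assuming defects follow a Poisson distribution so that the probability of zero defects on a unit is $e^{-DPU}$. The defect counts and step yields are hypothetical.

```python
import math

def dpu(total_defects, total_units):
    """Defects per unit."""
    return total_defects / total_units

def step_yield(dpu_value):
    """Probability of zero defects on a unit, from the Poisson model: e^(-DPU)."""
    return math.exp(-dpu_value)

def rolled_throughput_yield(step_yields):
    """Probability of passing every step defect-free: product of the step yields."""
    return math.prod(step_yields)

# Hypothetical figures: 20 defects found on 100 units.
unit_dpu = dpu(20, 100)
unit_yield = step_yield(unit_dpu)

# Hypothetical yields for two consecutive process steps.
rty = rolled_throughput_yield([0.9, 0.9])

print(unit_dpu, unit_yield, rty)
```

Because the RTY multiplies the step yields, it is never higher than the weakest step.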

The products have to be pulled from the trucks by transfer associates, they are systemically received by another associate before being transferred again from the receiving dock to the setup stations, and from there they are taken to the packaging area where they are packaged before being transferred to their stocking locations. On the outbound side, orders are dropped into the system and allocated to the different areas of the warehouse, and then they are assigned to the picking associates who pick the products and take them to the packing stations where another associate packs them.

After the products are packed, they are taken to the shipping dock where they are processed and moved to the trucks that will ship them to the customers. So the number of opportunities for a part that goes through the 12 processes to be defective will be Each part has 60 opportunities to be defective. What is the DPMO?

If successive trials are performed without replacement and the sample size or population is small, the probability for each observation will vary.
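The DPMO (defects per million opportunities) question can be sketched as follows. The 60 opportunities per part come from the text; the number of defects and the number of parts are hypothetical.

```python
def dpmo(total_defects, total_units, opportunities_per_unit):
    """Defects per million opportunities."""
    total_opportunities = total_units * opportunities_per_unit
    return total_defects * 1_000_000 / total_opportunities

# 60 opportunities per part (from the text); 15 defects in 1,000 parts is hypothetical.
warehouse_dpmo = dpmo(15, 1000, 60)

print(warehouse_dpmo)
```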

We are informed that 8 defective parts were shipped by mistake, and 5 parts have already been installed on machines. What is the probability that exactly 1 defective part was installed on a machine? Here the process for the hypergeometric distribution is the same as for the Poisson and binomial distributions. The results appear as shown in Figure 4.

The result appears in Formula result. A sample of 7 items is taken from a population of 19 items containing 11 blue items. What is the probability of obtaining exactly 3 blue items? An example of a random variable would be the time it takes a production line to produce one item. The main continuous distributions used in quality operations are the normal, the exponential, the log-normal, and the Weibull distributions.
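The blue-items question above gives every number needed for a hypergeometric calculation: a population of 19 with 11 blue items and a sample of 7. A minimal Python sketch, with `hypergeometric_pmf` as an illustrative helper name:

```python
import math

def hypergeometric_pmf(k, population, successes, sample):
    """Probability of exactly k successes when sampling without replacement."""
    ways_success = math.comb(successes, k)
    ways_failure = math.comb(population - successes, sample - k)
    return ways_success * ways_failure / math.comb(population, sample)

# Exactly 3 blue items in a sample of 7 from 19 items, 11 of which are blue.
prob_three_blue = hypergeometric_pmf(3, population=19, successes=11, sample=7)

print(round(prob_three_blue, 4))
```

This evaluates to 11550/50388, or about 0.229.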

The Poisson distribution is built on discrete random variables and describes random occurrences over some intervals, whereas the exponential distribution is continuous and describes the time between random occurrences.

Examples of an exponential distribution are the time between machine breakdowns and the waiting time in a line at a supermarket. Figure 4. What is the probability that the time until the line stops again will be more than 15 months?

What is the probability that the time until the line stops again will be less than 20 months? What is the probability that the time until the line stops again will be between 10 and 15 months?

Most of nature and human characteristics are normally distributed, and so are most production outputs for well-calibrated machines.
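The three exponential questions above (more than 15 months, less than 20 months, between 10 and 15 months) can be sketched in Python. The average time between stoppages is not available here, so `mean_months = 20.0` is a hypothetical value.

```python
import math

def prob_longer_than(t, mean_time):
    """P(T > t) for an exponential distribution with the given mean."""
    return math.exp(-t / mean_time)

mean_months = 20.0  # hypothetical average time between line stoppages

more_than_15 = prob_longer_than(15, mean_months)
less_than_20 = 1 - prob_longer_than(20, mean_months)
between_10_and_15 = (prob_longer_than(10, mean_months)
                     - prob_longer_than(15, mean_months))

print(more_than_15, less_than_20, between_10_and_15)
```

An interval probability is just the difference between two tail probabilities, which is why a single helper covers all three questions.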

When a population is normally distributed, most of the observations are clustered around the mean. The mean, the mode, and the median become good measures of estimates. The average height of an adult male is 5 feet and 8 inches. This does not mean all adult males are of that height but more than 80 percent of them are very close.

The weight and shape of apples are very close to their mean. The curve associated with that function is bell-shaped and has an apex at the center.

The area between the curve and the horizontal line is equal to one. For a sigma-scaled normal distribution, the area under the curve has been determined. Approximately 68.26 percent of the area lies within one standard deviation of the mean, 95.46 percent within two standard deviations, and 99.73 percent within three. The shape of the normal distribution depends on two factors, the mean and the standard deviation. Consider the following example. This area is 0. We can use Excel to come to the same result.

The production of light bulbs is normally distributed. The manufacturer wants to set the minimum life expectancy of the light bulbs so that less than 5 percent of the bulbs will have to be replaced.

What minimum life expectancy should be put on the light bulb labels? Solution The shaded area in Figure 4. What is the probability that the number of defective parts for a randomly selected sample will be less than 15?

So the probability that the number of defective parts will be less than 15 is 0.
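Normal-curve areas and the light bulb question can be sketched with Python's `statistics.NormalDist`. The mean and standard deviation of the bulb lifetimes are hypothetical (1,000 hours and 50 hours), since the original figures are not reproduced here.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Areas under the curve within one, two, and three standard deviations.
within_one = std_normal.cdf(1) - std_normal.cdf(-1)
within_two = std_normal.cdf(2) - std_normal.cdf(-2)
within_three = std_normal.cdf(3) - std_normal.cdf(-3)

# Hypothetical bulb lifetimes: mean 1,000 hours, standard deviation 50 hours.
bulbs = NormalDist(mu=1000, sigma=50)

# Lifetime below which only 5 percent of the bulbs are expected to fall.
minimum_life = bulbs.inv_cdf(0.05)

print(within_one, within_two, within_three, minimum_life)
```

`cdf` plays the role of the Z table, and `inv_cdf` answers the reverse question of which value cuts off a given area.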


A random variable is said to be log-normally distributed if its logarithm is normally distributed. Because the lognormal distribution is derived from the normal distribution, the two share most of the same properties.

It is a method very often used in quality control. In a large-scale production environment, testing every single product is not cost-effective because it would require a plethora of manpower and a great deal of time and space. Consider a company that produces 100,000 tires a day. If the company is open 16 hours a day (two shifts) and it takes an employee 10 minutes to test a tire, the testing of all the tires would require one million minutes, or about 16,667 hours, and the company would need a very large quality control staff to test every single tire that comes out of production, as well as a tremendous amount of space for the QA department and the outbound inventory.

Machine productions are generally normally distributed when the machines are well-calibrated. For a normally distributed production output, taking a sample of the output and testing it can help determine the quality level of the whole production. Depending on the type of data being analyzed and the purpose of the analysis, several methods can be used to collect samples. First of all, it is necessary to distinguish between random and nonrandom sampling. In a random sampling, all the items in the population are presumed identical and they have the same probability of being selected for testing.

For instance, the products that come from a manufacturing line are presumed identical and the auditor can select any one of them for testing. Although sampling is more often random, nonrandom sampling is also used in production.

For example, if the production occurs over 24 hours and the auditor only works 4 hours a day, the samples he or she takes cannot be considered random because they can only have been produced by the people who work on the same shift as the auditor. For instance, if we are testing the performance of a machine based on its output and it produces several different products, when sampling the products it would be more effective to subgroup the products by similarities.

In cluster sampling, every grouping is representative of the population; the items it contains are diverse. The samples and their means will be distributed as shown in Table 5. Based on the data in Table 5. The difference is known as the sampling error. Suppose a population of 10 bolts has diameter measurements of 9, 11, 12, 12, 14, 10, 9, 8, 7, and 9 mm. These differences are said to be due to chance. In that example, we had 10 bolts, and if all possible samples of three were computed, there would have been 120 samples and 120 means.
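The bolt example can be enumerated directly: the sketch below lists every possible sample of three from the ten diameters given in the text, computes each sample mean, and measures the sampling error against the population mean. Variable names are illustrative.

```python
from itertools import combinations
from statistics import fmean

# Bolt diameters in mm, as given in the text.
diameters = [9, 11, 12, 12, 14, 10, 9, 8, 7, 9]
population_mean = fmean(diameters)

# Every possible sample of three bolts and its mean.
sample_means = [fmean(sample) for sample in combinations(diameters, 3)]

# Sampling error: difference between each sample mean and the population mean.
sampling_errors = [m - population_mean for m in sample_means]

# Averaged over all samples, the sample means recover the population mean.
mean_of_means = fmean(sample_means)

print(len(sample_means), population_mean, mean_of_means)
```

Individual samples miss the population mean by varying amounts, but the errors cancel out across all possible samples.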

Based on the population given in Table 5. Why use sampling as a means of estimating the population parameters? The Central Limit Theorem can help us answer these questions. Solution Because the sample size is greater than 30, the Central Limit Theorem can be used in this case even though the number of defects per board follows a Poisson distribution. Therefore, the distribution of the sample mean x is approximately normal with the standard deviation 2.

We must determine the probability of having the mean between and However, when the data are countable, as in the case of people in a group or defective items on a production line, the sample proportion is the statistic of choice.

Example In a sample of workers, 25 might be coming late once a week. Example Forty percent of all the employees have signed up for the stock option plan. An HR specialist believes that this ratio is too high.

What is the probability of getting a sample proportion larger than this if the population proportion is really 0.40? The probability of getting a sample proportion larger than 0.

The engineers want to be able to date each light bulb to determine its longevity, yet it is not possible to test each light bulb in a production process that generates hundreds of thousands of light bulbs a day.
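The sample proportion question can be sketched with the normal approximation, where the standard error of the proportion is $\sqrt{pq/n}$. The population proportion 0.40 comes from the text; the sample size (150) and the sample proportion (0.46) are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_z(sample_prop, pop_prop, n):
    """Z value of a sample proportion under the normal approximation."""
    standard_error = math.sqrt(pop_prop * (1 - pop_prop) / n)
    return (sample_prop - pop_prop) / standard_error

# Population proportion 0.40 (from the text); n and the sample proportion are hypothetical.
z = proportion_z(sample_prop=0.46, pop_prop=0.40, n=150)

# Probability of observing a sample proportion larger than 0.46.
prob_larger = 1 - NormalDist().cdf(z)

print(z, prob_larger)
```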

But they can take a random sample and determine its average longevity, and from there they can estimate the longevity of the whole population. Using the Central Limit Theorem, we have determined that the Z value for sample means can be used for large samples. Example A survey was conducted of companies that use solar panels as a primary source of electricity. The question that was asked was this: How much of the electricity used in your company comes from the solar panels?

A random sample of 55 responses produced a mean of 45 megawatts. Suppose the population standard deviation for this question is In other words, the probability for the mean to be between The mean number of phone calls received at a call center per day is calls with a standard deviation of In fact, the Z formula has been determined not to always generate normal distributions for small sizes if the population is not normally distributed.
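A confidence interval for the mean of a large sample can be sketched as below. The sample size (55) and the sample mean (45) come from the solar panel example; the population standard deviation and the confidence level are not available here, so `sigma=15` and 95 percent are hypothetical.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval sample_mean ± z * sigma / sqrt(n) for a known population sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# n = 55 and mean = 45 from the text; sigma = 15 is a hypothetical value.
low, high = mean_confidence_interval(45, sigma=15, n=55)

print(low, high)
```

A higher confidence level or a larger standard deviation widens the interval; a larger sample narrows it.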

She takes a random sample of 19 cars that produces the following number of times the cars are rented in a month. The Minitab output will be 5. For instance, they would want to know how much variation the production process exhibits about the target to see what adjustments are needed to reach a defect-free process.


We have seen that if the means of all possible samples are obtained and organized we can derive the sampling distribution of the means. The same principle applies to the variances, and we would obtain the sampling distribution of the variances. Example A sample of 9 screws was taken out of a production line and the sizes of the diameters are shown in Table 5. From the data in Table 5. But again, the question of the sample size arises. Should we consider a sample of or of products from a production line to determine the quality level of the output?

She wants to be within two minutes of the actual length of time, and the standard deviation of the average time spent is known to be three minutes.

Hypothesis testing is about assessing the validity of a hypothesis made about a population. A hypothesis is a value judgment, a statement based on an opinion about a population.
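The sample size question above can be sketched with the usual formula $n = (z\sigma/E)^2$, rounded up. The margin of error (2 minutes) and the standard deviation (3 minutes) come from the text; the 95 percent confidence level is an assumption, since the original level is not visible here.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)

# sigma = 3 and margin = 2 from the text; 95 percent confidence is assumed.
n_needed = required_sample_size(sigma=3, margin=2)

print(n_needed)
```

Rounding up rather than to the nearest integer keeps the margin of error within the requested limit.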

It is developed to make an inference about that population. A test must be conducted to determine if the empirical evidence does support the hypothesis. Some examples of hypotheses are: The average number of defects per circuit board produced on a given line is 3.

The lifetime of a given light bulb is hours.


It will take less than 10 minutes for a given drug to start taking effect. Therefore, a sample will be taken and an inference will be made for the whole population. Because the company produces thousands of boards a day, it would not be cost effective to test every single board to validate or reject that statement, so a sample of boards is analyzed and statistics computed. Based on the results found and some decision rules, the hypothesis is or is not rejected.

If exactly 10 percent or 29 percent of the defects on the sample taken are actually traced to the CPU socket, the hypothesis will certainly be rejected, but what if Should the 0. Should we reject the statement in this case? To answer these questions, we must understand how hypothesis testing is conducted. Six steps are usually followed to test a hypothesis and determine whether it is to be rejected beyond a reasonable doubt.

In the case of the circuit boards at Sikasso, the hypothesis would be: But if enough evidence is statistically provided that the null hypothesis is untrue, an alternate hypothesis should be assumed to be true.

That alternate hypothesis, denoted H1, tells what should be concluded if H0 is rejected. The objective here is to generate a single number that will be compared to H0 for rejection. That number is called the test statistic.

Suppose that in the case of the defects on the circuit boards, a sample of 40 boards was randomly taken for analysis and 45 percent of the defects were actually found to be traceable to the CPU sockets. In that case, we would have rejected the null hypothesis as false. But what if the sample were taken from a substandard population?

We would have rejected a null hypothesis that might be true. We therefore would have committed what is called the Type I or Alpha error. We would have assumed the null hypothesis to be true when it actually is false. The one-tailed right-tailed graph in Figure 6. This level often varies between 0. The critical Z-value is obtained from the Z score table by using the 0. Another way to solve it. Because the mean obtained from the sample is 78,, we cannot reject the null hypothesis.

The results obtained do not allow a comparison with a single value to make an assessment; any value of X that falls within that interval would lead to a non-rejection of the null hypothesis.

For instance, in the example above the p-value is 0. We cannot reject the null hypothesis in this case. Example The diameter of the shafts produced by a machine has historically been 5.


The old machine has been discarded and replaced with a new one. The reliability engineer wants to make sure that the new machine performs as well as the old one. We want to test the validity of the null hypothesis, H 0: Check the option Perform hypothesis test. One-Sample Z: The mean of the sample is 5.

If the value of the sample mean falls within this interval, we cannot reject the null hypothesis. The value of the sample mean 5. The p-value, 0. The formula for the t test resembles the one for the Z test but the tables used to compute the values for Z and t are different.
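The one-sample Z test described above can be sketched in a few lines of Python. This is only an illustration: the function name `one_sample_z` and every number in the usage example are hypothetical, because the figures of the shaft example are truncated in this text. The sketch assumes a two-tailed test with a known population standard deviation.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample Z test with known population sigma."""
    se = sigma / sqrt(n)                      # standard error of the mean
    z = (sample_mean - mu0) / se              # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Non-rejection interval for the sample mean under H0
    interval = (mu0 - z_crit * se, mu0 + z_crit * se)
    return z, p_value, interval


# Hypothetical inputs, not the values of the shaft example
z, p_value, interval = one_sample_z(sample_mean=5.88, mu0=5.85, sigma=0.24, n=61)
inside = interval[0] <= 5.88 <= interval[1]
print(f"z = {z:.3f}, p-value = {p_value:.3f}, mean inside interval: {inside}")
```

With these made-up inputs the sample mean falls inside the interval and the p-value exceeds 0.05, so both decision rules lead to the same conclusion: the null hypothesis cannot be rejected.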

The mean thickness was historically 0. A Quality Assurance manager wants to determine if the age of the machine is causing it to produce poorer quality gaskets. Solution The null hypothesis should state that the population mean is still 0.

The value of t that we will be looking for is t0. We conclude that we cannot reject the null hypothesis. The p-value of 0. In this situation, the Central Limit Theorem can be used, as in the case of the distribution of the mean: Example A design engineer claims that 90 percent of the alloy bars he created become PSI (pounds per square inch) strong 12 hours after they are produced.

In a sample of 10 bars, 8 were PSI strong after 12 hours. Solution In this case, the null and alternate hypotheses will be H 0: The last 17 days, the standard deviation has been 3.
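For the alloy bar example (a claimed proportion of 90 percent, with 8 of 10 sampled bars meeting the requirement), the Z statistic for a proportion can be sketched as follows. The function name is hypothetical, and the two-sided p-value is an assumption, since the alternate hypothesis and the significance level are truncated in this text. With only 10 bars the normal approximation is rough, so the result should be read as an illustration of the formula rather than a firm conclusion.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """Z statistic for a sample proportion against a claimed proportion p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)              # standard error under H0
    z = (p_hat - p0) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))  # assumed two-sided test
    return p_hat, z, p_two_sided


p_hat, z, p_value = one_proportion_z(successes=8, n=10, p0=0.90)
print(f"p_hat = {p_hat:.2f}, z = {z:.4f}, two-sided p-value = {p_value:.3f}")
```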

Doing it another way. The same result can be obtained another way. Very often, it is not enough to be able to make statistical inference about one population. We sometimes want to compare two populations. A quality controller might want to compare data from a production line to see what effect the aging machines are having on the production process over a certain period of time. A manager might want to know how the productivity of her employees compares to the average productivity in the industry.

In this section, we will learn how to test and estimate the difference between two population means, proportions, and variances. The Central Limit Theorem applies in this case, too. For the same month, the average productivity per employee at Cazamance Electromotive was machines per hour with a standard deviation of 9 machines.

If 45 employees at Senegal-Electric and 39 at Cazamance Electromotive were randomly sampled, what is the probability that the difference in sample averages would be greater than 20 machines? Therefore, the probability that the difference in the sample averages will be greater than 20 machines is 0. In other words, there exists a 3. At least two conditions must be considered—the approach we take when making an inference about the two means depends on whether their variances are equal or not.
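The probability that the difference between two sample means exceeds a threshold follows from the normal distribution of that difference. The sketch below assumes independent samples large enough for the Central Limit Theorem to apply. The sample sizes (45 and 39) and the standard deviation of 9 come from the example; the two population means and the first standard deviation are hypothetical placeholders, because those figures are truncated here, so the printed probability is not the one quoted in the text.

```python
from math import sqrt
from statistics import NormalDist


def prob_diff_exceeds(threshold, mu1, sigma1, n1, mu2, sigma2, n2):
    """P(xbar1 - xbar2 > threshold) for two independent large samples."""
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)  # std. error of the difference
    z = (threshold - (mu1 - mu2)) / se
    return 1 - NormalDist().cdf(z)


# mu1, mu2 and sigma1 are hypothetical; n1, n2 and sigma2 follow the example
p = prob_diff_exceeds(threshold=20, mu1=250, sigma1=12, n1=45,
                      mu2=235, sigma2=9, n2=39)
print(f"P(difference > 20) = {p:.4f}")
```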

If one or both samples are smaller than 30, the t statistic must be used. The estimate $S_p^2$ based on the two sample variances is called the pooled sample variance. A sample of 15 items was taken from Population 1 with a standard deviation of 3, and a sample of 19 items was taken from Population 2 with a standard deviation of 2. Find the pooled sample variance. To determine if there is a difference in the mean of the CSI in the two plants, random samples are taken over several weeks. For the Kayor plant, a sample of 17 weeks has yielded a mean of 96 CSI and a standard deviation of 3, and for the Matam plant, a sample of 19 weeks has generated a mean of 98 CSI and a standard deviation of 4.
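Both examples above can be worked with the usual pooled-variance formula, $S_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$, which assumes the two population variances are equal. The sketch below uses the sample sizes, means, and standard deviations stated in the examples; the function names are hypothetical. It computes only the t statistic and degrees of freedom, leaving the comparison with the critical value to the t table.

```python
from math import sqrt


def pooled_variance(n1, s1, n2, s2):
    """Pooled sample variance of two samples (equal variances assumed)."""
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)


def pooled_t(mean1, n1, s1, mean2, n2, s2):
    """Two-sample t statistic and degrees of freedom using the pooled variance."""
    sp2 = pooled_variance(n1, s1, n2, s2)
    t = (mean1 - mean2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# First example: samples of 15 and 19 items
print(pooled_variance(15, 3, 19, 2))          # 6.1875

# Second example: Kayor (17 weeks) versus Matam (19 weeks)
t, df = pooled_t(96, 17, 3, 98, 19, 4)
print(f"t = {t:.4f} with {df} degrees of freedom")
```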

At the 0. From the t table, we obtain t 0. Because we are faced with a two-tailed graph, the graph that illustrates the results obtained in the previous example should look like that in Figure 6. Because t 0. Solution The null and alternate hypotheses in this case would be: Figure 6. In those cases, the tests of hypothesis are more often about the variance. Most statistical tests for the mean require the equality of the variances for the populations.

The hypothesis testing of two population variances is done using samples taken from those populations. The F distribution is used in this case. The graph for an F distribution is shown in Figure 6. Here again, what must be compared are the calculated F and the critical F obtained from the table. Two values are of interest: The critical Hypothesis Testing Figure 6. Example Kolda Automotive receives gaskets for its engines from two suppliers. He takes a sample of 10 gaskets from supplier A and 12 from supplier B and obtains a standard deviation of 0.

Solution The null and alternate hypotheses will be: The critical F for the upper tail is F0. The critical F for the lower tail is the inverse of this value, F0.
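The comparison of the calculated F with the two critical values can be sketched as below. The standard deviations and the critical values in the usage lines are hypothetical placeholders (the figures of the gasket example are truncated here) and should be replaced by the sample results and by values read from an F table. One detail worth making explicit: the lower-tail critical value is the reciprocal of the upper-tail critical value taken with the degrees of freedom swapped.

```python
def f_statistic(s1, s2):
    """Ratio of two sample variances, given the sample standard deviations."""
    return s1 ** 2 / s2 ** 2


def two_tailed_f_decision(f, upper_crit, upper_crit_swapped_df):
    """Compare F with the upper critical value and the derived lower one.

    upper_crit            : table value F(alpha/2; df1, df2)
    upper_crit_swapped_df : table value F(alpha/2; df2, df1)
    """
    lower_crit = 1 / upper_crit_swapped_df
    if f > upper_crit or f < lower_crit:
        return "reject H0"
    return "cannot reject H0"


# Placeholder inputs: samples of 10 and 12 gaskets give df1 = 9 and df2 = 11
f = f_statistic(s1=0.058, s2=0.055)
print(f"F = {f:.4f}:", two_tailed_f_decision(f, upper_crit=3.6,
                                             upper_crit_swapped_df=3.9))
```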

Several options are given by Minitab to test the normality of data. If the data are normally distributed, they all should be close to the mean and when we plot them on a graph, they should cluster closely about each other.

The null hypothesis for normality will be H0: The data are normally distributed. The alternate hypothesis will be H1: The data are not normally distributed. To run the normality test, Minitab offers several options. For this example, we will run the Anderson-Darling Test. The output we obtain should look like Figure 6. It is clear that the dots are not all closely clustered about the regression line, and they follow a certain pattern that does not suggest normality.

In fact, improving productivity and enhancing customer satisfaction must go together because productivity improvement enables companies to lower the cost of quality improvement. One way of improving productivity is through the reduction of defects and rework. The reduction of rework and defects is not achieved through inspection at the end of production lines; it is done by instilling quality in the production processes themselves and by inspecting and monitoring the processes in progress before defective products or services are generated.

The prerequisites for improving customer satisfaction while improving productivity address two aspects of operations: Variations are nothing but deviations from the preset targets, and no matter how well controlled a process is, variations will always be present.

The causes of the variations are divided into two categories. They are called common causes (Deming) or random causes (W. Shewhart) when they are inherent to the production process. Machine tune-ups are an example of common causes of variation.


They are called special causes (Deming) or assignable causes (W. Shewhart) when they can be traced to a source that is not part of the production process. A sleepy machine operator would be an example of an assignable cause of variation. To be able to predict the quality level of the products or services, the processes used to generate them must be stable.

The stability refers to the absence of special causes of variation. Statistical Process Control (SPC) is a technique that enables the quality controller to monitor, analyze, predict, control, and improve a production process through control charts.

Control charts were developed as a monitoring tool for SPC by Shewhart; they are among the most important tools in the analysis of production process variations. A typical control chart plots sample statistics and is made up of at least four lines: They are not outside the control limits and based on their pattern, the process trends can be predicted because the variations are strictly due to common causes. The control chart shown in Figure 7.

The purpose of using control charts is: The control charts help detect the assignable causes of variation in time so that appropriate actions can be taken to bring the process back in control. Most production processes allow operators a certain level of leeway to make adjustments on the machines that they are using when it is necessary. Yet over-adjusting machines can have a negative impact on the output. Control charts can indicate when the adjustments are necessary and when they are not.

If the production process is not monitored, defective products will be produced resulting in extra rework or defects sent to customers. Being able to predict the variation of the quality level of a production process is very important because the variations determine the quantity of defects and the amount of work or rework that might be required to deliver customer orders on time.

Samples must be taken at preset intervals and tested to make sure that the quality of the products sent to customers meets their expectations. If the products are found to be defective, the reasons for the defects are investigated and adjustments are made to prevent future defects.

Making adjustments to the production process does not necessarily lead to a total elimination of variations; in some cases, it may even lead to further defects if done improperly or done when not warranted. While the production process is in progress, whether adjustments are made or not, the process continues to be monitored, samples continue to be taken, and their statistics plotted and trends observed.

The expected amount of defects that the process produces is measured by a method called Process Capability Analysis, which will be dealt with in the next chapter. Consider the length as being the critical characteristic of manufactured bolts. The mean length of the bolts is 17 inches with a known standard deviation of 0.
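When the process mean and standard deviation are known, three-sigma control limits for a chart of sample means are commonly placed at the mean plus or minus three standard errors. The sketch below illustrates this for the bolt example. The mean of 17 inches comes from the text; the standard deviation and the sample size are hypothetical placeholders because those figures are truncated here.

```python
from math import sqrt


def xbar_limits(mean, sigma, n):
    """Three-sigma control limits for sample means of size n (sigma known)."""
    half_width = 3 * sigma / sqrt(n)
    return mean - half_width, mean, mean + half_width


# sigma=0.01 and n=5 are assumed values, not taken from the example
lcl, center, ucl = xbar_limits(mean=17, sigma=0.01, n=5)
print(f"LCL = {lcl:.4f}, CL = {center}, UCL = {ucl:.4f}")
```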

Control charts are an effective tool for detecting the special causes of variation. One of the most visible signs of assignable causes of variation is the presence of an outlier on a control chart. If some points are outside the control limits, this will indicate that the process is out of control and corrective actions must be taken. The chart in Figure 7. The process appears stable, with only common-cause variation, until Sample 25 is plotted. That sample is well outside the control limits.

Because the process had been stable until that sample was taken, something unique must have happened to cause it to be outside the limits. The causes of that special variation must be investigated so that the process can be brought back under control.

Figure 7. These two charts are completely separate entities. A process can be within the control limits with a high variability, or too many of the plotted points are too close to one control limit and away from the target.

In this example, all the plots are well within the limits but the circled groupings do not behave randomly—they exhibit a run-up pattern. In other words, they follow a steady increasing trend.

The causes of this run-up pattern must be investigated because it might be the result of a problem with the process. Western Electric (WECO) published a handbook to determine the rules for interpreting the process patterns.

A process is said to be out-of-control if one of the following occurs: When the process is out-of-control, production is stopped and corrective actions are taken. The corrective actions start with the determination of the category of the variation. The causes of variation can be random or assignable.
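The list of rules is not reproduced in this text. As an illustration only, the sketch below checks the four zone rules usually attributed to the Western Electric handbook: one point beyond three sigma; two of three consecutive points beyond two sigma on the same side; four of five consecutive points beyond one sigma on the same side; and eight consecutive points on the same side of the center line. The function name is hypothetical, and the windowed check is a simplification that flags the last point of any window satisfying a rule.

```python
def weco_violations(points, center, sigma):
    """Return (index, rule_number) pairs for the four common WECO zone rules."""
    z = [(x - center) / sigma for x in points]
    # (rule number, window length, points required, distance in sigmas)
    windowed_rules = ((2, 3, 2, 2), (3, 5, 4, 1), (4, 8, 8, 0))
    hits = []
    for i in range(len(z)):
        if abs(z[i]) > 3:                          # rule 1: beyond 3 sigma
            hits.append((i, 1))
        for rule, window, needed, limit in windowed_rules:
            if i + 1 < window:
                continue                           # not enough points yet
            recent = z[i + 1 - window:i + 1]
            above = sum(1 for v in recent if v > limit)
            below = sum(1 for v in recent if v < -limit)
            if max(above, below) >= needed:        # same side of center line
                hits.append((i, rule))
    return hits


# Hypothetical data: the last point lies 3.5 sigma above the center line
print(weco_violations([10, 10.2, 9.8, 13.5], center=10, sigma=1))
```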

If the causes of variation are solely due to chance, they are called chance causes (Shewhart) or common causes (Deming). If the variations can instead be traced to an identifiable source, they are said to be due to assignable causes (Shewhart) or special causes (Deming). Finding and correcting special causes of variation is easier than correcting common causes because the common causes are inherent to the process. The types of charts used for attribute data are: The p-chart is used when dealing with ratios, proportions, or percentages of conforming or nonconforming parts in a given sample.

A good example for a p-chart is the inspection of products on a production line. They are either conforming or nonconforming. Because the products are only inspected once, the experiments are independent from one another. Example Table 7. We want to build a control chart that monitors the proportions of defects found on each sample taken. TABLE 7. In this case, we can say that the process is stable and under control because all the plots are within the control limits and the variation exhibits a random pattern around the mean.
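The control limits of a p-chart are usually computed as the overall proportion plus or minus three standard errors, with the standard error recalculated for each sample size. The sketch below shows that calculation. The data are hypothetical because the table of the example is not reproduced in this text, and the function name is an assumption.

```python
from math import sqrt


def p_chart_limits(defectives, sample_sizes):
    """Center line and per-sample three-sigma limits for a p-chart."""
    p_bar = sum(defectives) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
        # A proportion cannot fall below 0 or above 1
        limits.append((max(0.0, p_bar - half_width),
                       min(1.0, p_bar + half_width)))
    return p_bar, limits


# Hypothetical inspection results: defective items and sample sizes
defectives = [2, 3, 1, 4, 2]
sizes = [21, 25, 30, 28, 31]
p_bar, limits = p_chart_limits(defectives, sizes)
for d, n, (lcl, ucl) in zip(defectives, sizes, limits):
    status = "in control" if lcl <= d / n <= ucl else "out of control"
    print(f"n={n}: p={d / n:.3f}, LCL={lcl:.3f}, UCL={ucl:.3f}, {status}")
```

Because the limits depend on each sample size, smaller samples get wider limits than larger ones.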

One of the advantages of using the p-chart is that the variations of the process change with the sizes of the samples or the defects found on each sample. The np-chart. The np-chart is one of the easiest to build. While the p-chart tracks the proportion of nonconformities per sample, the np-chart plots the number of nonconforming items per sample. Sample 2 was of size 21 and had 2 defects, and Sample 34 was of size 31 and had 2 defects, and they are both plotted at the same level on the chart.

The chart does not plot the defects relative to the sizes of the samples from which they are taken. For that reason, the p-chart has superiority over the np-chart. Consider the same data used to build the chart in Figure 7. We obtain the chart shown in Figure 7. If the sample size for the p-chart is a constant, the trends for the p-chart and the np-chart would be identical but the control limits would be different. The p-chart in Figure 7. The c-chart. The c-chart is useful for the process engineer to know not just how many items are not conforming but how many defects there are per item.
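For the np-chart discussed above, the limits are commonly computed from a constant sample size n as n times the average proportion, plus or minus three times the binomial standard deviation. The sketch below assumes a constant sample size and uses hypothetical counts; the function name is an assumption.

```python
from math import sqrt


def np_chart_limits(defectives, n):
    """Three-sigma limits for an np-chart with constant sample size n."""
    p_bar = sum(defectives) / (n * len(defectives))
    center = n * p_bar
    half_width = 3 * sqrt(n * p_bar * (1 - p_bar))
    return max(0.0, center - half_width), center, center + half_width


# Hypothetical counts of nonconforming items in samples of 50
lcl, center, ucl = np_chart_limits([4, 6, 3, 5, 7, 5], n=50)
print(f"LCL = {lcl:.3f}, CL = {center:.3f}, UCL = {ucl:.3f}")
```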

Knowing how many defects there are on a given part produced on a line might in some cases be as important as knowing how many parts are defective. Here, nonconformance must be distinguished from defective items because there can be several nonconformities on a single defective item. If the sample size does not change and the defects on the items are fairly easy to count, the c-chart becomes an effective tool to monitor the quality of the production process.

We want to build a control chart to monitor the production process and determine if it is stable and under control. Figure 7. Sample 65 is beyond three standard deviations from the mean. Something special must have happened that caused it to be so far out of the control limits.

The process must be investigated to determine the causes of that deviation and corrective actions taken to bring the process back under control.
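A c-chart typically uses the average number of defects per sample as its center line and places the limits three square roots of that average away, following the Poisson model. The sketch below flags samples outside those limits. The defect counts are hypothetical, since the data of the example are not reproduced in this text.

```python
from math import sqrt


def c_chart(counts):
    """Limits of a c-chart and the indexes of samples outside them."""
    c_bar = sum(counts) / len(counts)
    half_width = 3 * sqrt(c_bar)
    lcl, ucl = max(0.0, c_bar - half_width), c_bar + half_width
    outside = [i for i, c in enumerate(counts) if c < lcl or c > ucl]
    return lcl, c_bar, ucl, outside


# Hypothetical defect counts; the last sample is unusually high
lcl, c_bar, ucl, outside = c_chart([4, 6, 5, 3, 7, 5, 4, 18])
print(f"LCL = {lcl:.2f}, CL = {c_bar:.2f}, UCL = {ucl:.2f}, outside: {outside}")
```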

The u-chart. One of the premises for a c-chart is that the sample sizes had to be the same.


The sample sizes can vary when a u-chart is being used to monitor the quality of the production process, and the u-chart does not require any limit to the number of potential defects. Furthermore, for a p-chart or an np-chart the number of nonconformities cannot exceed the number of items on a sample, but for a u-chart it is conceivable because what is being addressed is not the number of defective items but the number of defects on the sample. The control limits are determined based on u and the mean of the samples, n: The products are assembled in kits of 70 per unit before they are sent to the customers.
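The formula for the limits is truncated in this text. A common form of the u-chart uses the overall number of defects per unit as the center line and recomputes the limits for each sample size; the sketch below follows that form. The data and the function name are hypothetical.

```python
from math import sqrt


def u_chart_limits(defects, sample_sizes):
    """Center line and per-sample three-sigma limits for a u-chart."""
    u_bar = sum(defects) / sum(sample_sizes)       # defects per unit overall
    limits = []
    for n in sample_sizes:
        half_width = 3 * sqrt(u_bar / n)
        limits.append((max(0.0, u_bar - half_width), u_bar + half_width))
    return u_bar, limits


# Hypothetical defect counts and number of units inspected in each sample
defects = [12, 15, 9, 20]
sizes = [10, 12, 8, 15]
u_bar, limits = u_chart_limits(defects, sizes)
for d, n, (lcl, ucl) in zip(defects, sizes, limits):
    print(f"n={n}: u={d / n:.3f}, LCL={lcl:.3f}, UCL={ucl:.3f}")
```

Note that the defects per unit can exceed 1, which is not possible for the proportions plotted on a p-chart.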

McGraw-Hill eBooks are available at special quantity discounts for use as premiums and sales promotions, or for use in corporate training programs. Use of this work is subject to these terms. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited.

Your right to use the work may be terminated if you fail to comply with these terms. McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work.

This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise.