Statistical Analysis

 

Binomial Test

The binomial test procedure compares the observed frequencies of the two categories of a dichotomous variable to the frequencies expected under a binomial distribution with a specified probability parameter. In general, for a success probability p and n independent trials, the binomial distribution gives the probability of obtaining exactly k successes in the n trials.
Example: When a fair dime is tossed, the probability of heads equals ½. To test this hypothesis, a dime is tossed 40 times and the outcomes are recorded. If the binomial test finds that ¾ of the tosses are heads and that the observed significance level is small (0.0027), it is unlikely that the probability of heads equals ½; the coin is probably biased.
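The coin example can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular statistics package: the function names are hypothetical, and the two-sided p-value is defined here as the total probability of all outcomes that are no more likely than the observed one. Because it is an exact calculation, the result may differ slightly from the significance level quoted above if that figure was obtained from a normal approximation.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_test_two_sided(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability of every outcome that is
    no more likely than the observed count k under the null hypothesis."""
    observed = binomial_pmf(k, n, p)
    total = 0.0
    for i in range(n + 1):
        pmf = binomial_pmf(i, n, p)
        # Small relative tolerance guards against floating-point ties.
        if pmf <= observed * (1 + 1e-9):
            total += pmf
    return total

heads, tosses = 30, 40  # three quarters of the 40 tosses are heads
p_value = binomial_test_two_sided(heads, tosses, 0.5)
print(f"exact two-sided p-value = {p_value:.4f}")
```

A p-value this far below the usual 0.05 or 0.01 levels leads to the same conclusion as in the example: the hypothesis of a fair coin is rejected.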

Cluster Analysis

Cluster analysis attempts to identify relatively homogeneous groups of cases based on selected characteristics, using an algorithm that can handle a large number of cases.
Example: Some identifiable groups of television shows attract similar audiences within each group. One could cluster television shows into k homogeneous groups based on viewer characteristics. This can be used to identify market segments. It is also possible to cluster cities into homogeneous groups so that comparable cities can be selected to test various marketing strategies.
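The text does not name a specific clustering algorithm. As one possible illustration, the sketch below uses plain k-means, which is an assumption on our part. The viewer profiles are invented, and taking the first k cases as starting centroids is a simplification chosen only to keep the example deterministic.

```python
from math import dist

def k_means(points, k, iterations=100):
    """Plain k-means: alternate between assigning each case to its nearest
    centroid and moving each centroid to the mean of its assigned cases."""
    # Assumption: the first k points serve as the starting centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:  # assignments have stabilised
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical viewer profile per show: (average viewer age, share of urban viewers)
shows = [(22, 0.80), (25, 0.75), (58, 0.30), (61, 0.35), (24, 0.82), (63, 0.28)]
labels, centroids = k_means(shows, k=2)
print(labels)
```

In practice the characteristics are usually standardized first, since a variable measured on a large scale (here, age) otherwise dominates the distance.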

Discriminant Analysis

Discriminant analysis is useful for situations where one wants to build a predictive model of group membership based on observed characteristics of each case. The procedure generates a discriminant function based on linear combinations of the predictor variables that provide the best discrimination between groups.
Example: On average, people in temperate zone countries consume more calories per day than those in the tropics, and a greater proportion of the people in the temperate zones are city dwellers. A researcher wants to combine this information in a function and determine how well that function discriminates between the two groups of countries. The researcher thinks that population size and economic information may also be important. Discriminant analysis allows one to estimate the coefficients of the linear discriminant function.
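To make the idea of a linear discriminant function concrete, here is a rough sketch of Fisher's two-group linear discriminant with two predictors. The country figures are invented, the function names are hypothetical, and the sketch assumes equal group covariances and a cutoff halfway between the two group means; a statistics package would offer more options.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def pooled_covariance(group_a, group_b):
    """Pooled within-group covariance matrix for two predictors (2x2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group_a, group_b):
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(group_a) + len(group_b) - 2
    return [[v / dof for v in row] for row in s]

def fisher_discriminant(group_a, group_b):
    """Coefficients w = S^-1 (mean_a - mean_b) and the midpoint cutoff."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    (a, b), (c, d) = pooled_covariance(group_a, group_b)
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    cutoff = sum(wi * (x + y) / 2 for wi, x, y in zip(w, ma, mb))
    return w, cutoff

def classify(case, w, cutoff):
    score = w[0] * case[0] + w[1] * case[1]
    return "temperate" if score > cutoff else "tropical"

# Hypothetical data: (daily calories in thousands, percent of city dwellers)
temperate = [(3.2, 70), (3.4, 80), (3.0, 78), (3.4, 72)]
tropical = [(2.3, 40), (2.5, 50), (2.1, 48), (2.5, 42)]
w, cutoff = fisher_discriminant(temperate, tropical)
print(classify((3.1, 72), w, cutoff), classify((2.2, 44), w, cutoff))
```

The coefficients in w are the weights of the linear combination; a case is assigned to the group on whose side of the cutoff its score falls.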

Factor Analysis

Factor analysis attempts to identify underlying variables, or factors, that explain the pattern of correlations within a set of observed variables. Factor analysis is often used in data reduction to identify a small number of factors that explain most of the variance observed in a much larger number of manifest variables. Factor analysis can also be used to generate hypotheses regarding causal mechanisms or to screen variables for subsequent analysis.
Example: To find the underlying attitudes that lead people to respond as they do to the questions in a political survey, one can examine the correlations among the survey items. These may reveal significant overlap among various subgroups of items: questions about taxes tend to correlate with each other, questions about military issues correlate with each other, and so on. With factor analysis, it is possible to investigate the number of underlying factors and, in many cases, to identify what the factors represent conceptually.

Hypothesis Testing

Hypothesis testing addresses the question “could these observations really have occurred by chance?” The null hypothesis is usually that the observations are purely the result of chance. The alternative hypothesis is that there is a real effect and that the observations are the result of this real effect plus chance variation. The p value is a probability statement that answers the question: if the null hypothesis were true, what is the probability of observing a test statistic at least as extreme as the one actually observed? The p value is compared to a fixed significance level α; if the p value is smaller than α, the null hypothesis is rejected. In scientific work, a fixed α-level of .05 or .01 is often used.
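The comparison of a p value with a fixed significance level can be sketched as follows. The example assumes a test statistic that follows a standard normal distribution under the null hypothesis; the observed value of 2.1 is hypothetical, as are the function names.

```python
from statistics import NormalDist

def two_sided_p_value(z: float) -> float:
    """Probability, under the null hypothesis, of a standard normal test
    statistic at least as extreme (in either direction) as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value: float, alpha: float) -> str:
    """Reject the null hypothesis only when p falls below the chosen level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

z_observed = 2.1  # hypothetical test statistic
p = two_sided_p_value(z_observed)
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: p = {p:.4f} -> {decide(p, alpha)}")
```

The same observation can lead to different decisions at different significance levels, which is why the level should be fixed before the data are examined.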

Chi-Square Test

The chi-square test procedure tabulates a variable into categories and computes a chi-square statistic. The chi-square test can be used to determine whether all categories contain the same proportion of values or whether each category contains a user-specified proportion of values. It can also be used to detect whether there is a significant association between two categorical variables.
Example: The chi-square test could be used to determine whether a bag of jelly beans contains equal proportions of blue, brown, green, orange and red beans. The chi-square test can also indicate whether the type of training used has a significant effect on whether an animal (dog or cat) would dance.
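A goodness-of-fit version of the jelly bean example might look like the sketch below. The counts are invented and the function names are hypothetical. The p-value uses a closed form of the chi-square upper tail that holds only for an even number of degrees of freedom, which happens to apply here (five categories, four degrees of freedom).

```python
from math import exp, factorial

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_p_value_even_df(x: float, df: int) -> float:
    """Upper-tail probability of the chi-square distribution.
    Uses a closed form that is valid only when df is even."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    return exp(-half) * sum(half**j / factorial(j) for j in range(df // 2))

# Hypothetical counts of blue, brown, green, orange and red jelly beans
observed = [30, 18, 22, 12, 18]
expected = [sum(observed) / len(observed)] * len(observed)  # equal proportions
stat = chi_square_statistic(observed, expected)
p_value = chi_square_p_value_even_df(stat, df=len(observed) - 1)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

To test user-specified proportions instead of equal ones, only the expected list needs to change.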

Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov test procedure compares the observed cumulative distribution function for a variable with a specified theoretical distribution, which may be normal, uniform, Poisson, or exponential. This “goodness-of-fit test” tests whether the observations could reasonably have come from the specified distribution.
Example: Many parametric tests require normally distributed variables. The Kolmogorov-Smirnov test can be used to test whether a variable, e.g. ‘income’, is normally distributed.
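The test statistic is the largest gap between the observed and the theoretical cumulative distribution functions, which can be computed as in the sketch below. The income figures and the parameters of the reference normal distribution are hypothetical and assumed to be specified in advance; the critical value shown is only a rough large-sample approximation at the 5% level.

```python
from math import sqrt
from statistics import NormalDist

def ks_statistic(sample, cdf):
    """Largest vertical gap between the empirical CDF of the sample
    and the theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Hypothetical incomes (in thousands) and an assumed reference distribution
incomes = [21, 25, 28, 30, 31, 33, 35, 38, 42, 55]
reference = NormalDist(mu=34, sigma=9)
d = ks_statistic(incomes, reference.cdf)
critical = 1.36 / sqrt(len(incomes))  # rough large-sample 5% critical value
print(f"D = {d:.3f}, approximate critical value = {critical:.3f}")
```

If D exceeds the critical value, the hypothesis that the data come from the specified distribution is rejected. When the mean and standard deviation are estimated from the same sample, this critical value is no longer appropriate and a corrected version of the test is needed.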

Correlation Analysis

Researchers often want to know what relationship exists, if any, between two or more variables. A correlation is a measure of the linear relationship between variables. Although one cannot draw direct conclusions about causality, the squared correlation coefficient, r², measures the proportion of variability in one variable that is explained by the other.
Example: One may look at the relationship between exam anxiety and exam performance.
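As a small illustration, the Pearson correlation coefficient and its square can be computed directly from the definition. The anxiety and performance scores below are invented for the sketch.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y divided by the
    product of their individual variations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical exam anxiety and exam performance scores for six students
anxiety = [60, 72, 85, 90, 55, 78]
performance = [75, 64, 50, 42, 80, 58]
r = pearson_r(anxiety, performance)
print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```

A negative r here would mean that higher anxiety tends to go with lower performance; it would not by itself show that anxiety causes the lower scores.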

Correspondence Analysis

One of the goals of correspondence analysis is to describe the relationships between two nominal variables in a correspondence table in a low-dimensional space, while simultaneously describing the relationships between the categories for each variable. For each variable, the distances between category points in a plot reflect the relationships between the categories, with similar categories plotted close to each other.

Factor analysis is a standard technique for describing relationships between variables in a low-dimensional space. However, factor analysis requires interval data, and the number of observations should be at least five times the number of variables. Correspondence analysis, on the other hand, assumes nominal variables and can describe the relationships between categories of each variable, as well as the relationship between the variables.
Example: Correspondence analysis could be used to graphically display the relationship between staff category and smoking habits. One might find that, with regard to smoking, junior managers differ from secretaries, but secretaries do not differ from senior managers. One might also find that heavy smoking is associated with junior managers, whereas light smoking is associated with secretaries.

Regression Analysis

In simple linear regression, the outcome variable Y is predicted from a predictor X using the equation of a straight line, Y = b0 + b1·X, where b0 is the intercept and b1 is the slope. Given that several values of Y and X have been collected, the unknown parameters in the equation can be estimated. They are estimated by fitting a model to the data (e.g. a straight line) for which the sum of the squared differences between the line and the actual data points is minimized. This method is called the method of least squares. Multiple regression is a logical extension of these principles to situations in which there are several predictors.
Example: One can estimate the parameters that describe the relationship between record sales and the amount spent promoting the record.
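For a single predictor, the least squares estimates have a simple closed form, sketched below. The promotion budgets and sales figures are invented, and the function name is hypothetical.

```python
def least_squares_line(x, y):
    """Intercept and slope of the line that minimizes the sum of squared
    vertical differences between the line and the data points."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: promotion budget (thousands) and record sales (thousands)
promotion = [10, 20, 30, 40, 50]
sales = [120, 150, 205, 230, 290]
intercept, slope = least_squares_line(promotion, sales)
print(f"sales = {intercept:.1f} + {slope:.2f} * promotion")
```

The slope estimates how much sales change for each additional unit spent on promotion, and the intercept estimates sales when nothing is spent.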

One-Way Analysis of Variance (ANOVA)

The one-way ANOVA procedure produces a one-way analysis of variance for a quantitative dependent variable by a single factor (independent) variable. Analysis of variance is used to test the hypothesis that several means are equal.
Example: Doughnuts absorb fat in various amounts when they are cooked. An experiment is set up involving three types of fat: peanut oil, corn oil, and lard. Peanut oil and corn oil are unsaturated fats, and lard is a saturated fat. Along with determining whether the amount of fat absorbed depends on the type of fat used, you could set up an “a priori” contrast to determine whether the amount of fat absorption differs for saturated and unsaturated fats.
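The F statistic behind the test compares the variation between group means with the variation within groups. The sketch below computes it for invented fat absorption figures; it stops at the F ratio and its degrees of freedom, which would then be compared with an F table or passed to statistical software, and it does not cover the a priori contrast.

```python
def one_way_anova_f(groups):
    """F ratio for a one-way ANOVA, with its degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical grams of fat absorbed per batch for each type of fat
peanut_oil = [64, 72, 68, 77]
corn_oil = [78, 91, 85, 82]
lard = [75, 93, 78, 71]
f, df1, df2 = one_way_anova_f([peanut_oil, corn_oil, lard])
print(f"F({df1}, {df2}) = {f:.2f}")
```

A large F ratio indicates that the group means differ by more than the variation within groups would suggest.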

Wilcoxon Nonparametric Test

The Wilcoxon signed-rank test is used in situations in which there are two sets of scores to compare, but these scores come from the same subjects.
Example: An experimental scientist may be interested in the change of depression levels, within subjects, for each of two different drugs.
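The core of the signed-rank calculation is ranking the absolute within-subject differences and summing the ranks separately for positive and negative changes. The sketch below shows only that step, with invented depression scores; converting the rank sums into a significance level requires tables or statistical software.

```python
def wilcoxon_signed_rank(before, after):
    """Sums of ranks for positive and negative within-subject differences.
    Zero differences are dropped and tied magnitudes share an average rank."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for t in range(i, j + 1):
            ranks[t] = average_rank
        i = j + 1
    t_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return t_plus, t_minus

# Hypothetical depression scores for the same subjects before and after a drug
before = [28, 35, 30, 27, 33, 31]
after = [23, 36, 24, 25, 26, 27]
t_plus, t_minus = wilcoxon_signed_rank(before, after)
print(f"T+ = {t_plus}, T- = {t_minus}")
```

If the drug had no effect, the two rank sums would be expected to be roughly equal; a large imbalance suggests a systematic change.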

References: Field, Andy (2000). Discovering Statistics, Sage Publications.

 
Copyright © 2004 Frekans Research Field & Data Processing Co. Ltd. All rights reserved.