Statistics is the science concerned with the collection, organization, analysis and interpretation of data. It is very useful for research purposes, where it is used to summarize and arrange data collected from various sources. It is sometimes described as a branch of mathematics because of its heavy use of mathematical elements and formulas. Although statistics covers a wide range of methods, some of its well-known tests and techniques are Analysis of variance (ANOVA), Chi-square test, Correlation, Factor analysis, Mann-Whitney U, Mean square weighted deviation (MSWD), Pearson product-moment correlation coefficient, Regression analysis, Spearman’s rank correlation coefficient, Student’s t-test and Time series analysis. Statistics has many subdisciplines and areas of application, including Actuarial science, Applied information economics, Biostatistics, Business statistics, Chemometrics (for analysis of data from chemistry), Data mining (applying statistics and pattern recognition to discover knowledge from data), Demography, Econometrics, Energy statistics, Engineering statistics, Environmental statistics, Epidemiology, Geochemistry, Geography and Geographic Information Systems (specifically in spatial analysis), Geostatistics, Hydrogeology, Hydrology, Image processing, Meteorology, Oceanography, Operations research (or operational research), Petroleum geology, Population ecology, Psychological statistics, Psychometrics, Quality control, Quantitative psychology, Reliability engineering, Social statistics, Statistical finance (econophysics), Statistical mechanics, Statistical physics and Statistical thermodynamics.
Statistics is used in many fields of research and education, such as engineering, biology, geography, economics, finance, insurance, biochemistry, demography, quantitative psychology, process control, probability, econophysics and thermodynamics.
Correlation is a broad term for a statistical relationship between two or more random variables or observed data values. Correlations are useful because they can indicate a predictive relationship that can be exploited in practice. A correlation can suggest a possible causal or mechanistic relationship, but it does not by itself establish one. Most commonly the term refers to a more specific type of relationship between the mean values of two variables, typically a linear one. The degree of relationship is measured with correlation coefficients. The major ones are:
 Pearson correlation coefficient: Also known as the Pearson product-moment correlation coefficient, it is sensitive only to a linear relationship between two variables, which may exist even when one variable is a nonlinear function of the other. It is computed by dividing the covariance of the two variables by the product of their standard deviations. The Pearson coefficient is -1 in the case of a perfect decreasing linear relationship (anticorrelation) and +1 in the case of a perfect increasing linear relationship. A coefficient of zero indicates no linear relationship, although the variables may still be dependent.
 Rank correlation coefficients: These, such as Spearman’s rank correlation coefficient, measure a different type of association than the Pearson coefficient: the extent to which one variable tends to increase as the other increases, without requiring that the increase follow a linear relationship.
 Distance correlation: It was introduced to address the deficiency of Pearson’s correlation that it can be zero for dependent variables.
 Brownian correlation or covariance: Like distance correlation, it was designed to address this deficiency of Pearson’s coefficient.
 Partial correlation: It measures the strength of dependence between two variables after the effect of one or more other variables has been removed.
Besides these, there are many related concepts and variants of correlation. Generally these are Association (statistics), Autocorrelation, Brownian covariance, Canonical correlation, Coefficient of determination, Concordance correlation coefficient, Cophenetic correlation, Correlation function, Cross-correlation, Currency correlation, Distance correlation, Ecological correlation, Fraction of variance unexplained, Genetic correlation, Goodman and Kruskal’s lambda, Illusory correlation, Interclass correlation, Intraclass correlation, Linear correlation, Modifiable areal unit problem, Multiple correlation, Point-biserial correlation coefficient, Statistical arbitrage and Subindependence.
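The Pearson coefficient described above (covariance divided by the product of the standard deviations) can be sketched in a few lines of Python. This is only an illustration: the function name pearson_r and the sample data are made up for the example, and the last case shows a pair of variables that are clearly dependent yet have a coefficient of zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation: cov(x, y) / (sd(x) * sd(y))."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations; the 1/(n-1) factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)

# Illustrative data (not from any real study).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
increasing = pearson_r(x, [2 * v + 1 for v in x])    # perfect increasing line
decreasing = pearson_r(x, [10 - 3 * v for v in x])   # perfect decreasing line
# y = x**2 on a symmetric range: y depends on x, but there is no linear trend.
symmetric = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
```

The first two cases give coefficients at the two extremes of the scale, while the third gives zero even though y is completely determined by x, which is the deficiency that distance correlation was introduced to address.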
 Regression analysis is a statistical method for prediction and forecasting. It includes techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables. Regression analysis helps to understand how the typical value of the dependent variable changes when any one of the independent variables is varied while the other independent variables are held fixed. Most commonly it estimates the conditional expectation of the dependent variable given the independent variables. The estimation target is a function of the independent variables called the regression function, and it is also of interest to characterize the variation of the dependent variable around the regression function, which can be described by a probability distribution. The method of least squares was the earliest form of regression; it was published by Legendre in 1805 and by Gauss in 1809. Both applied the method to the problem of determining, from astronomical observations, the orbits of bodies around the Sun. In 1821 Gauss published a further development of the theory of least squares, including a version of the Gauss-Markov theorem. Regression analysis got its name from Francis Galton, who used the term to describe a biological phenomenon. Later researchers extended regression analysis to a more general statistical context, making it applicable in many other fields. A regression model involves the following variables:
 The unknown parameters (denoted β)
 The independent variable (known as X)
 The dependent variable (known as Y)
Besides this, regression analysis encompasses various topics of statistical analysis, including the underlying statistical assumptions, linear regression, the general linear model, regression diagnostics, regression with “limited dependent” variables, interpolation and extrapolation, nonlinear regression, power and sample size calculations and so on.
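As a minimal sketch of the method of least squares for a single independent variable X and dependent variable Y, the following Python code estimates the two unknown parameters of the line Y = b0 + b1 * X. The function name fit_line, the parameter names b0 and b1, and the data are assumptions chosen for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the independent variable must not be constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx               # slope that minimizes the sum of squared residuals
    b0 = mean_y - b1 * mean_x    # the fitted line passes through the point of means
    return b0, b1

# Illustrative data (not from any real study).
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 7.1, 8.7]
b0, b1 = fit_line(xs, ys)

# Using the fitted regression function to predict Y at a new value of X.
# Note that X = 5 lies outside the observed range, so this is an extrapolation.
predicted_at_5 = b0 + b1 * 5
```

The slope b1 estimates how much the typical value of Y changes when X increases by one unit, which is the interpretation of regression described above.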
A random variable, or stochastic variable, is a variable whose value is determined by the outcome of a random process, such as a measurement. Formally, in probability and statistics, a random variable is a function that maps events or outcomes to values. Its possible values represent the possible outcomes of an experiment or test that is yet to be performed. A random variable can take different values, with probabilities described by its probability distribution. The particular values obtained when a random variable is observed are called realizations, or random variates. Random variables are usually real-valued, but they may also take values such as Boolean values, complex numbers, functions, matrices, manifolds, sequences, sets, shapes and processes. Random variables are used in the sciences and engineering to make predictions based on data acquired from scientific experiments. They are also a tool for making predictions in games of chance and about stochastic events. They also underlie distributions such as the chi-square distribution, which arises as the distribution of a sum of squares of independent standard normal random variables. We can divide random variables into three categories:
 Discrete random variables: These map outcomes to values in a countable set, such as the integers, and each value in the set has a probability greater than or equal to zero.
 Continuous random variables: For these, the probability of any specific value is zero, while the probability of an interval or other infinite set of values may be positive.
 Mixed random variables: A random variable may also be mixed, combining the qualities of both discrete and continuous random variables.
These three categories correspond to the classification of probability distributions.
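To make the idea of a discrete random variable concrete, here is a small Python sketch that treats the sum of two fair dice as a function from outcomes (pairs of faces) to values, and builds its probability distribution by enumeration. The fair dice and the function name distribution_of_sum are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def distribution_of_sum():
    """Probability of each value of S = a + b for two fair six-sided dice."""
    pmf = defaultdict(Fraction)  # missing values start at probability 0
    for a, b in product(range(1, 7), repeat=2):
        # Each of the 36 outcomes (a, b) is equally likely; S maps it to a + b.
        pmf[a + b] += Fraction(1, 36)
    return dict(pmf)

pmf = distribution_of_sum()
total_probability = sum(pmf.values())
expected_value = sum(value * prob for value, prob in pmf.items())
```

Every value in the countable set 2, 3, ..., 12 receives a probability greater than or equal to zero, and the probabilities add up to one, as required of a discrete random variable.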
 In statistics, a probability distribution gives either the probability of each value of a random variable (when the variable is discrete) or the probability of the value falling within a particular interval (when the variable is continuous). It describes the range of possible values that a random variable can attain and how likely they are. Probability distributions come in many varieties, depending on their basic properties. Common distributions are usually grouped as follows:
 Related to real-valued quantities that grow linearly
 Related to positive real-valued quantities that grow exponentially
 Related to real-valued quantities that are assumed to be uniformly distributed
 Related to Bernoulli trials
 Basic distributions
 Related to sampling schemes over a finite population
 Related to categorical outcomes
 Related to events in a Poisson process
 Useful for hypothesis testing related to normally distributed outcomes
 Useful as conjugate prior distributions in Bayesian inference
Besides this, there are many specific probability distributions, including Bernoulli, Beta, Beta-binomial, Binomial, Boltzmann, Bose-Einstein, Cantor, Phase-type, Truncated and Mixture distributions, Categorical, Cauchy, Chernoff’s, Chi-square, Continuous uniform, Conway-Maxwell-Poisson, Degenerate, Dirac delta function, Discrete uniform, Exponential, Fermi-Dirac, Fisher’s noncentral hypergeometric, Fisher’s z-distribution, Gamma, Generalized extreme value, Generalized logistic, Generalized normal, Geometric, Gibbs, Hotelling’s T-square, Hyperbolic, Hypergeometric, Inverse Gaussian, Inverse-gamma, Irwin-Hall, Kent, Kumaraswamy, Laplace, Lévy, Logarithmic (series), Logistic, Logit-normal, Maxwell-Boltzmann, Multinomial, Negative binomial, Noncentral chi, Normal-exponential-gamma, Parabolic fractal, Pareto, Pearson Type III, Poisson binomial, Rademacher, Raised cosine, Rectangular, Rice, Scaled-inverse-chi-square and other forms of chi, Skellam, Triangular, Von Mises-Fisher, Wallenius’ noncentral hypergeometric, Weibull (or Rosin-Rammler), Wishart and inverse-Wishart, Yule-Simon, Zeta and Zipf’s law.
Bayes’ theorem (also called Bayes’ law or Bayes’ rule) expresses the conditional, or posterior, probability of a hypothesis H given some evidence in terms of the prior probability of H and the probability of the evidence. It rests on the idea that evidence supports H if it is more likely given H than given not-H. It is widely applied in engineering and the sciences and is valid in all the usual interpretations of probability. The theorem is named after the Reverend Thomas Bayes, who in the 18th century studied how to compute a distribution for the probability parameter of a binomial distribution. His work was edited by his friend Richard Price and published in 1763, after Bayes’ death, under the title An Essay towards solving a Problem in the Doctrine of Chances. The theorem states that P(A|B) = P(B|A) P(A) / P(B), and each probability in it has a name:
 P(A) is the prior probability of A.
 P(A|B) is the conditional probability of A given B, also called the posterior probability.
 P(B|A) is the conditional probability of B given A, also called the likelihood.
 P(B) is the prior or marginal probability of B.
Bayes’ theorem is used in many fields of research, for example in the interpretation of drug test results and in Bayesian inference. Today it is used to compute a wide range of quantities, and it has even been invoked in arguments about the existence of God.
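The drug testing application mentioned above makes a useful worked example. The Python sketch below computes the posterior probability that a person is a drug user (A) given a positive test (B). All the numbers are hypothetical, as is the function name posterior; P(B) is obtained from the law of total probability.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(A|B) = P(B|A) * P(A) / P(B) for a binary hypothesis A.

    prior               -- P(A)
    likelihood          -- P(B|A)
    false_positive_rate -- P(B|not A)
    """
    # Marginal probability of the evidence: P(B) = P(B|A)P(A) + P(B|not A)P(not A).
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    if evidence == 0:
        raise ValueError("the evidence has probability zero under these inputs")
    return likelihood * prior / evidence

# Hypothetical figures: 0.5% of people use the drug, the test detects 99% of
# users, and it wrongly flags 1% of non-users.
p_user_given_positive = posterior(prior=0.005, likelihood=0.99,
                                  false_positive_rate=0.01)
```

With these assumed figures the posterior comes out at roughly one in three, far lower than the 99% sensitivity of the test might suggest, because the prior probability of being a user is so small.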
The binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial; in fact, when n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is fundamental to probability and statistics and is the basis of the binomial test of statistical significance. It is frequently used to model the number of successes in a sample of size n drawn with replacement from a population. If the sampling is carried out without replacement, the draws are not independent and the resulting distribution is the hypergeometric distribution, not the binomial. Related topics include binomial probability, the binomial inverse theorem, binomial series, combinations, Stirling’s approximation, the multinomial theorem, the negative binomial distribution, Pascal’s triangle and the binomial approximation. The binomial distribution can be specified through its probability mass function or its cumulative distribution function, and is summarized by quantities such as its mean, mode and median. It is also closely related to other distributions, such as the Bernoulli distribution, the Poisson binomial distribution and, through the normal approximation, the normal distribution.
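The probability mass function and cumulative distribution function of the binomial distribution can be sketched in Python as follows. The function names and the values n = 10 and p = 0.3 are chosen only for illustration; the formula used is the standard one, C(n, k) * p^k * (1 - p)^(n - k).

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """Probability of at most k successes in n independent trials."""
    return sum(binomial_pmf(i, n, p) for i in range(0, min(k, n) + 1))

n, p = 10, 0.3  # illustrative values
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]
mean = sum(k * prob for k, prob in enumerate(probabilities))  # should equal n * p
at_most_three = binomial_cdf(3, n, p)

# With a single trial the binomial reduces to the Bernoulli distribution.
bernoulli_success = binomial_pmf(1, 1, p)
```

Summing the probability mass function over all k from 0 to n gives one, and the mean agrees with n * p up to rounding.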
Probability is a way of expressing knowledge or belief that an event will occur or has occurred. The concept has an exact mathematical meaning in probability theory, which is widely used in such areas of study as mathematics, statistics, finance, gambling, science, artificial intelligence/machine learning and philosophy to draw conclusions about the likelihood of potential events and the underlying mechanics of complex systems. Probability theory is a branch of mathematics that is used in many other branches of mathematics and science as well as in everyday life, for example in risk assessment in business, commodity markets, the insurance sector and economics. The word probability derives from the Latin probabilitas, which can also mean probity, a measure of the authority or nobility of a witness. Although this differs from the modern definition of probability, the Latin meaning is quite close to the modern notion of statistical inference. The mathematical study of probability began in the 17th century with Pierre de Fermat, Blaise Pascal and Christiaan Huygens, while the work of Jakob Bernoulli and Abraham de Moivre established it as a branch of mathematics. Probability theory represents probabilistic concepts in formal terms and is concerned with the uncertainty attached to a question. The formal terms are manipulated by the rules of logic and mathematics to determine how likely particular answers, solutions or decisions are.
Probability theory is built on the idea of randomness and connects to many mathematical topics, including Black Swan theory, Calculus of predispositions, Class membership probabilities, Decision theory, Equiprobability, Fuzzy measure theory, Game theory, Gaming mathematics, Information theory, Measure theory, Negative probability, Probabilistic argumentation, Probabilistic logic, Random fields, Random variables, Stochastic processes, the Wiener process, etc. This can be a very challenging aspect of statistics for many students. Our team would be happy to help you if you get stuck on any probability problem. You can place an order if you have any probability assignment or homework help query.
An expectation-maximization (EM) algorithm is a method for finding maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variables. EM is an iterative method that alternates between an expectation (E) step, which computes the expectation of the log-likelihood evaluated using the current estimate of the parameters, and a maximization (M) step, which computes the parameters that maximize the expected log-likelihood found in the E step. These parameter estimates are then used in the next E step. The EM algorithm has an important place in statistical computation. It was explained and given its name in a 1977 paper by Arthur Dempster, Nan Laird and Donald Rubin (hence also Dempster-Laird-Rubin), published in the Journal of the Royal Statistical Society. A detailed treatment of the EM method for exponential families was published by Rolf Sundberg. Expectation-maximization is frequently used for data clustering in computer science and machine learning, and in natural language processing. In psychometrics, EM is essential for estimating the item parameters and latent abilities of item response theory models. EM is also used in medical image reconstruction, including single photon emission computed tomography and positron emission tomography. Several methods have been proposed to accelerate the sometimes slow convergence of EM, such as conjugate gradient and modified Newton-Raphson techniques. Two major variants of the algorithm are expectation conditional maximization (ECM) and the generalized expectation maximization (GEM) algorithm.
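The alternation between the E step and the M step can be illustrated with a small clustering problem. The Python sketch below fits a mixture of two one-dimensional normal distributions, where the latent variable is the unobserved component each point came from. It is a simplified illustration under stated assumptions, not a production implementation: the starting values, the fixed number of iterations, the small numerical floors and the data are all choices made for this example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture by EM; returns (mu1, mu2, s1, s2, w)."""
    n = len(data)
    # Assumed starting values: means at the extremes, unit spreads, equal weights.
    mu1, mu2 = min(data), max(data)
    s1 = s2 = 1.0
    w = 0.5  # mixing weight of component 1
    for _ in range(iterations):
        # E step: posterior probability that each point came from component 1,
        # computed with the current parameter estimates.
        resp = []
        for x in data:
            a = w * normal_pdf(x, mu1, s1)
            b = (1.0 - w) * normal_pdf(x, mu2, s2)
            resp.append(a / (a + b) if a + b > 0.0 else 0.5)
        # M step: weighted maximum likelihood updates of the parameters.
        n1 = max(sum(resp), 1e-12)       # floors guard against an empty component
        n2 = max(n - sum(resp), 1e-12)
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        s1 = max(math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1), 1e-3)
        s2 = max(math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / n2), 1e-3)
        w = min(max(n1 / n, 1e-6), 1.0 - 1e-6)
    return mu1, mu2, s1, s2, w

# Illustrative data: two well-separated clusters, near 0 and near 10.
data = [-0.8, -0.3, 0.0, 0.2, 0.9, 9.1, 9.7, 10.0, 10.4, 10.8]
mu1, mu2, s1, s2, w = em_two_gaussians(data)
```

On data this well separated, the estimated means settle near the two cluster centers and the mixing weight near one half. On harder data the result depends on the starting values, since EM is only guaranteed to reach a local maximum of the likelihood.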
A hypothesis is a proposed explanation for an observable phenomenon. The term derives from the Greek hypotithenai, meaning “to put under” or “to suppose.” A hypothesis is used to understand a phenomenon with the help of scientific reasoning and logic. It differs from a scientific theory: a hypothesis is a provisional proposal that still has to be tested, and in logic the term also refers to the antecedent of a proposition. Hypotheses are used in almost every field, including the sciences and engineering, to find the most acceptable answers to phenomena, problems and questions. Researchers need them in order to test and confirm a theory. Hypotheses can be classified into several categories based on methodology and the tools required. Generally these are:
 Scientific hypothesis: It offers a suggested solution to a problem on the basis of evidence, experiments and tests. Its desirable properties are simplicity, testability, scope, fruitfulness and conservatism (fit with other available knowledge).
 Evaluation hypothesis: This type is based on Karl Popper’s formulation of conjectures and refutations. A hypothesis of this kind can never be conclusively confirmed, because a future observation may always contradict it. It remains a suggested solution to a problem that may turn out to be false.
 Statistical hypothesis testing: Two hypotheses, known as the null hypothesis and the alternative hypothesis, are compared. The null hypothesis states that there is no relation between the phenomena, while the alternative hypothesis states that there is some kind of relation. The data are used to decide between them: if the evidence against the null hypothesis is strong enough, the null hypothesis is rejected in favor of the alternative; otherwise, the null hypothesis is not rejected.
The central limit theorem (CLT) is a result in probability theory concerning sums of a large number of independent random variables, classically ones with finite mean and variance. More generally, the central limit theorem is any of a set of weak-convergence theorems. They all express the fact that a sum of many independent random variables will tend to be distributed according to one of a small set of “attractor” (i.e. stable) distributions. When the variance of the variables is finite, the “attractor” distribution is the normal distribution. By contrast, the sum of a number of random variables with power-law tail distributions decreasing as 1/|x|^(a+1), where 0 < a < 2 (and therefore having infinite variance), will tend to a stable distribution with stability parameter (or index of stability) a as the number of variables grows. Here we discuss only the classical (i.e. finite variance) central limit theorem. A simple example of the central limit theorem is rolling a large number of identical dice, each of which is weighted unfairly in some unknown way: the distribution of the sum (or average) of the rolled numbers is well approximated by a normal distribution. The CLT helps to explain why the normal distribution is so prevalent, and it justifies approximating large-sample statistics by the normal distribution in controlled experiments. Because the theorem describes an asymptotic distribution, the approximation is reasonable only when the number of observations is large. There are alternative statements and extensions of the central limit theorem, involving tools such as characteristic functions, the Lindeberg condition and the multidimensional central limit theorem.
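The dice example can be checked with a short simulation. The Python sketch below rolls 50 identically weighted unfair dice many times and compares the simulated sums with the mean and standard deviation that a normal approximation would use. The weights, the number of dice, the number of repetitions and the random seed are all arbitrary choices for illustration.

```python
import random
import statistics

FACES = [1, 2, 3, 4, 5, 6]
WEIGHTS = [0.05, 0.05, 0.10, 0.10, 0.20, 0.50]  # hypothetical unfair die

def sum_of_rolls(rng, n_dice):
    """Sum of n_dice independent rolls of the weighted die."""
    return sum(rng.choices(FACES, weights=WEIGHTS, k=n_dice))

rng = random.Random(42)  # fixed seed so the run is repeatable
n_dice = 50
sums = [sum_of_rolls(rng, n_dice) for _ in range(5000)]

# Theoretical mean and variance of a single roll of this die.
die_mean = sum(f * w for f, w in zip(FACES, WEIGHTS))
die_var = sum(w * (f - die_mean) ** 2 for f, w in zip(FACES, WEIGHTS))

# For a sum of independent rolls, means and variances add.
expected_mean = n_dice * die_mean
expected_sd = (n_dice * die_var) ** 0.5

sample_mean = statistics.mean(sums)
sample_sd = statistics.stdev(sums)
# Share of simulated sums within one standard deviation of the mean;
# a normal distribution puts roughly 68% of its mass in that range.
within_one_sd = sum(abs(s - expected_mean) <= expected_sd for s in sums) / len(sums)
```

Even though a single roll of this die is heavily skewed toward six, the sums cluster around the expected mean in the bell-shaped pattern the theorem predicts. Because the sums are whole numbers, the observed share within one standard deviation is only approximately the normal value.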
According to Wikipedia, “In statistics, analysis of variance (ANOVA) is a collection of statistical models, and their associated procedures, in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation.” In its simplest form, ANOVA provides a statistical test of whether the means of several groups are all equal, and so generalizes the t-test to more than two groups. ANOVA has an advantage over the two-sample t-test when more than two groups are compared: performing multiple two-sample t-tests increases the chance of committing a Type I error, and ANOVA avoids this. Techniques related to ANOVA were in use in the 1800s in connection with least squares, but Sir Ronald Fisher proposed a formal analysis of variance in his 1918 article “The Correlation between Relatives on the Supposition of Mendelian Inheritance”. He also made it widely known by including it in his 1925 book Statistical Methods for Research Workers. ANOVA models fall into three classes:
 Fixed-effects models assume that the data come from normal populations that may differ only in their means. They allow the researcher to estimate the ranges of response variable values generated in the population.
 Random-effects models assume that the data describe a hierarchy of different populations whose differences are constrained by the hierarchy. They are used when the treatments are not fixed.
 Mixed-effects models describe situations in which both fixed and random effects are present.
ANOVA is used in several forms, such as one-way ANOVA (to test for differences among two or more groups), factorial ANOVA (to study interaction effects), repeated measures ANOVA (when the same subjects are used for each treatment) and multivariate analysis of variance (MANOVA, used when there is more than one response variable).
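The partitioning of variance behind a one-way ANOVA can be sketched in Python as follows. The code splits the total variation into a between-groups part and a within-groups part and forms the F statistic from their mean squares. The function name and the three small groups are made up for the example, and turning the F statistic into a p-value would additionally require the F distribution, which is not shown here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Illustrative data: three treatments with three observations each.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_between, df_within = one_way_anova_f(groups)
```

A large F statistic means the group means differ by more than the within-group scatter would explain, which is evidence against the hypothesis that all the means are equal. A single F test of this kind replaces the three separate two-sample t-tests that would otherwise be needed for these groups.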
Minitab is a statistics computer program developed by Barbara F. Ryan, Thomas A. Ryan, Jr., and Brian L. Joiner at Pennsylvania State University in 1972. Minitab began as a light version of OMNITAB, the best-known statistical analysis program of that time, made by NIST. Minitab is distributed by Minitab Inc., a private company based in State College, Pennsylvania, with three subsidiaries in Coventry, England (Minitab Ltd.), Paris, France (Minitab SARL) and Sydney, Australia (Minitab Pty.). It is frequently used with Six Sigma, CMMI and other statistics-based process improvement methods. Since the first version, Minitab Inc. has released 16 versions, each intended to make the program more efficient and useful for its users. Minitab offers a broad range of statistical features to make computation easy and painless. Some of its features are Data and File Management, Basic Statistics, Graphics, Regression Analysis, Analysis of Variance, Design of Experiments, Statistical Process Control, Measurement Systems Analysis, Reliability/Survival Analysis, Multivariate Analysis, Time Series and Forecasting, Nonparametrics, Tables, Power and Sample Size, Simulation and Distributions, Macros and Customizability. Besides this, Minitab provides an assistant and user-friendly tools that make it accessible to non-specialists. It supports a wide range of statistical tests, distributions and formulas, including various correlation coefficients, variables and functions.
Minitab comes with built-in utilities for applying various tests and distributions, together with user-friendly graphical presentation tools. These include Spreadsheet-like worksheets, Query databases with ODBC, Data manipulation (merge, subset, sort, transpose, code, etc.), Matrix functions, Correlation and covariance, Descriptive statistics, Goodness-of-fit test for Poisson, Normality test, One and two proportions tests, Variances tests, One- and two-sample Poisson rate tests, t-tests, paired t-test, One-sample Z test, Interactively edit attributes (axes, labels, etc.), Probability and probability distribution plots, Binary, ordinal and nominal logistic regression, Easily create indicator variables, Linear, nonlinear and orthogonal regression, ANOVA, GLM, MANOVA, Response prediction, Individual distribution identification, Johnson transformation, Multi-Vari chart, Multivariate control charts (T-squared, generalized variance, MEWMA), Analysis of multiple failure modes and repairable systems, Threshold parameter distributions, Auto, partial auto, and cross correlation functions, Chi-square, Fisher’s exact, and other tests, Tally and cross tabulation, etc. In general, Minitab is very useful for statistical calculations and data analysis. Our experts can help you with any assignment or homework related to Minitab.
 A chi-square test, or χ² test, is a statistical hypothesis test in which the sampling distribution of the test statistic is a chi-square distribution when the null hypothesis is true, or in which this is asymptotically true, meaning that the sampling distribution can be made to approximate a chi-square distribution as closely as desired by making the sample size large enough. A few examples of chi-square tests (where the chi-square distribution is only approximately valid) are:
 Pearson’s chi-square test, also known as the chi-square goodness-of-fit test or the chi-square test for independence
 Yates’ chi-square test, also known as Yates’ correction for continuity
 Linear-by-linear association chi-square test
 Mantel-Haenszel chi-square test
 Portmanteau test in time-series analysis (a test for the presence of autocorrelation)
 Likelihood-ratio tests, which are used in general statistical modeling
Besides this, the chi-square distribution can be used to test a hypothesis about the variance of a normally distributed population on the basis of a sample variance. This test is not common in practice, however, because of the complications involved.
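As a sketch of how the test statistic for Pearson’s chi-square goodness-of-fit test is computed, the Python code below compares observed counts from 120 rolls of a die with the counts expected if the die were fair. The counts and the function name are invented for the example. The critical value quoted in the comment is the standard tabulated 5% point of the chi-square distribution with 5 degrees of freedom.

```python
def chi_square_statistic(observed, expected):
    """Pearson's statistic: the sum of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts of each face in 120 rolls.
observed = [18, 22, 16, 25, 19, 20]
expected = [sum(observed) / 6] * 6   # a fair die gives 20 of each face on average

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1

# Tabulated 5% critical value of the chi-square distribution with 5 degrees
# of freedom. The null hypothesis of a fair die is rejected only if the
# statistic exceeds this value.
CRITICAL_VALUE_5_PERCENT_DF5 = 11.070
reject_fairness = statistic > CRITICAL_VALUE_5_PERCENT_DF5
```

For these counts the statistic is well below the critical value, so the data give no reason to reject the hypothesis that the die is fair.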
 SPSS is a computer program for statistical analysis. The name is an abbreviation of Statistical Package for the Social Sciences. It was developed by Norman H. Nie and C. Hadlai Hull in 1968 at Stanford University, where Nie was then a political science postgraduate working on a research project; he later held a professorship at the University of Chicago. SPSS is among the most widely used programs for statistical analysis in the social sciences and is used by education researchers, students, government, marketing organizations, market researchers, survey companies, health researchers and others. It was developed for statistical research, data manipulation and computation. Its base functionality includes statistical procedures such as cross tabulation, frequencies, descriptives, descriptive ratio statistics, means, correlation (bivariate, partial, distances), ANOVA, t-test and nonparametric tests, linear regression, factor analysis, cluster analysis (two-step, K-means and hierarchical) and discriminant analysis. The software was long produced by SPSS Inc. and was for a time marketed as PASW (Predictive Analytics SoftWare) Statistics; in 2009 the company was acquired by IBM, and SPSS is now an IBM product. SPSS has several add-ons that extend its capabilities. Some of them are:
 SPSS Advanced Models: It was part of the core package until SPSS 14 and is now available as an add-on. It provides multivariate GLM and repeated measures ANOVA.
 SPSS Classification Trees: Creates classification and decision trees for identifying groups and predicting behavior.
 SPSS Complex Samples: Added in SPSS 12. It adjusts for stratification, clustering and other aspects of sample design.
 SPSS Data Validation: Allows programming of logical checks and reporting of suspicious values.
 SPSS Exact Tests: Allows statistical testing on small samples.
 SPSS Programmability Extension: Allows programmatic control of SPSS.
 SPSS Regression Models: Adds logistic regression, multinomial logistic regression, ordinal regression and mixed models.
 SPSS Tables: Gives user-defined control over the output of reports.
 SPSS Categories
 SPSS Conjoint
 SPSS Map
 SPSS Trends
Besides this, SPSS Server is a version of SPSS with a client/server architecture. It came with some features that were not available in the desktop/laptop version, such as scoring functions (these have been included in the desktop version since SPSS 19).