Binomial Random Variables: Introduction

So far, in our discussion about discrete random variables, we have been introduced to: the probability distribution, which tells us which values a variable takes and how often it takes them; the mean of the random variable, which tells us the long-run average value that the random variable takes; and the standard deviation of the random variable, which tells us a typical (or long-run average) distance between the mean of the random variable and the values it takes. We will now introduce a special class of discrete random variables that are very common, because as you’ll see, they …
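As a small illustration of the three ideas recalled above, here is a minimal Python sketch that computes the mean and standard deviation of a discrete random variable from its probability distribution. The distribution used (number of heads in two fair coin tosses) and the function names are assumptions chosen for illustration, not taken from the post.

```python
from math import sqrt

# Hypothetical distribution (an assumption for illustration):
# X = number of heads in two fair coin tosses.
# Keys are the values X takes, entries are their probabilities.
distribution = {0: 0.25, 1: 0.50, 2: 0.25}


def mean(dist):
    """Long-run average value: sum of value * probability."""
    return sum(x * p for x, p in dist.items())


def std_dev(dist):
    """Typical distance from the mean: square root of the
    probability-weighted average squared deviation."""
    mu = mean(dist)
    return sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))


mu = mean(distribution)
sigma = std_dev(distribution)
print(f"mean = {mu}, standard deviation = {sigma:.4f}")
```

The probabilities must add up to 1 for the dictionary to describe a valid distribution; the sketch does not check this.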

Introduction to Normal Random Variables: Overview

In the Exploratory Data Analysis sections of this course, we encountered data sets, such as lengths of human pregnancies, whose distributions naturally followed a symmetric unimodal bell shape, bulging in the middle and tapering off at the ends. Many variables, such as pregnancy lengths, shoe sizes, foot lengths, and other human physical characteristics exhibit these properties: symmetry indicates that the variable is just as likely to take a value a certain distance below its mean as it is to take a value that same distance above its mean; the bell-shape indicates that values closer to the mean are more likely, and it …

Hypothesis Testing: Introduction

We are now moving to the other kind of inference, hypothesis testing. We say that hypothesis testing is “the other kind” because, unlike the inferential methods we presented so far, where the goal was estimating the unknown parameter, the idea, logic and goal of hypothesis testing are quite different. In the first part of this section we will discuss the idea behind hypothesis testing, explain how it works, and introduce new terminology that emerges in this form of inference. The next two parts will be more specific and will discuss hypothesis testing for the population proportion (p), and for the population mean (μ). General …

The Big Picture: Inference

Recall again the Big Picture, the four-step process that encompasses statistics: data production, exploratory data analysis, probability, and inference. We are about to start the fourth part of the process and the final section of this course, where we draw on principles learned in the other units (exploratory data analysis, producing data, and probability) in order to accomplish what has been our ultimate goal all along: use a sample to infer (or draw conclusions) about the population from which it was drawn. The specific form of inference called for depends on the type of variables involved: either a single categorical or quantitative …

Conditional Probability and Independence: Introduction

In the last section, we established the five basic rules of probability, which include the two restricted versions of the Addition Rule and Multiplication Rule: the Addition Rule for Disjoint Events and the Multiplication Rule for Independent Events. We have also established a General Addition Rule for which the events need not be disjoint. In order to complete our set of rules, we still require a General Multiplication Rule for which the events need not be independent. In order to establish such a rule, however, we first need to understand the important concept of conditional probability. This section will be organized as follows: We’ll first …
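For reference, the three rules named in this excerpt can be written out as follows. This is the standard textbook form, with A and B as generic events; the notation is an assumption and may differ from the one used in the full post.

```latex
\begin{align*}
  % Addition Rule for Disjoint Events (A and B cannot both occur)
  P(A \text{ or } B) &= P(A) + P(B) \\
  % Multiplication Rule for Independent Events
  P(A \text{ and } B) &= P(A)\,P(B) \\
  % General Addition Rule (A and B need not be disjoint)
  P(A \text{ or } B) &= P(A) + P(B) - P(A \text{ and } B)
\end{align*}
```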

Probability: A Short Story

Sample Spaces. As we saw in the previous section, probability questions arise when we are faced with a situation that involves uncertainty. Such a situation is called a random experiment: an experiment that produces an outcome that cannot be predicted in advance (hence the uncertainty). Here are a few examples of random experiments: Toss a coin once and record whether you get heads (H) or tails (T). The possible outcomes that this random experiment can produce are: {H, T}. Toss a coin twice. The possible outcomes that this random experiment can produce are: {HH, HT, TH, TT}. Toss a coin 3 …
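The coin-toss outcome lists above can be generated mechanically. The following is a minimal Python sketch; the function name `coin_sample_space` is a hypothetical helper introduced here for illustration, and it assumes each toss is recorded as "H" or "T" in order.

```python
from itertools import product


def coin_sample_space(n_tosses):
    """List every possible outcome of tossing a coin n_tosses times,
    each outcome written as a string such as 'HT'."""
    return ["".join(outcome) for outcome in product("HT", repeat=n_tosses)]


one_toss = coin_sample_space(1)
two_tosses = coin_sample_space(2)

print(one_toss)    # ['H', 'T']
print(two_tosses)  # ['HH', 'HT', 'TH', 'TT']
```

Each additional toss doubles the number of outcomes, so n tosses give 2**n outcomes in total.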

Causation and Lurking Variables With Simpson’s Paradox

The one and only principal rule in statistics is: association does not imply causation! The scatterplot below illustrates how the number of firefighters sent to fires (X) is related to the amount of damage caused by fires (Y) in a certain city. The scatterplot clearly displays a fairly strong (slightly curved) positive relationship between the two variables. Would it, then, be reasonable to conclude that sending more firefighters to a fire causes more damage, or that the city should send fewer firefighters to a fire in order to decrease the amount of damage done by the fire? Of course not! So what is going …