EEB330 – Precept 08: Stats in Python

Author: Michelle White

Date: November 7, 2024

Output: html

GitHub Assignment Link

Exercise 1: Hypothesis Testing with scipy

Example from geeksforgeeks.org.

Instructions:

  1. Assess the homogeneity of variance in the sample datasets
  2. Determine if the sample data is approximately normally distributed
  3. Make sure there are no significant outliers in the sample data
  4. Perform a two-sample t-test on the sample data and report the p-value

Deliverables:

# import the necessary libraries
import numpy as np
import scipy.stats as stats

# create sample data
data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,
                        17, 16, 14, 19, 20, 21, 15,
                        15, 16, 16, 13, 14, 12])

data_group2 = np.array([15, 17, 14, 17, 14, 8, 12,
                        19, 19, 14, 17, 22, 24, 16,
                        13, 16, 13, 18, 15, 13])
# Your code here!
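
One possible way to work through the four steps above is sketched below. The specific checks are assumptions rather than requirements of the exercise: Levene's test for homogeneity of variance, the Shapiro-Wilk test for normality, and the 1.5 × IQR rule for flagging outliers. The helper iqr_outliers and the 0.05 cutoff are illustrative choices, not part of the original example.

```python
import numpy as np
import scipy.stats as stats

data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,
                        17, 16, 14, 19, 20, 21, 15,
                        15, 16, 16, 13, 14, 12])

data_group2 = np.array([15, 17, 14, 17, 14, 8, 12,
                        19, 19, 14, 17, 22, 24, 16,
                        13, 16, 13, 18, 15, 13])

# 1. homogeneity of variance (assumption: Levene's test)
levene_stat, levene_p = stats.levene(data_group1, data_group2)
print(f"Levene p-value: {levene_p:.4f}")

# 2. approximate normality (assumption: Shapiro-Wilk test on each group)
_, shapiro_p1 = stats.shapiro(data_group1)
_, shapiro_p2 = stats.shapiro(data_group2)
print(f"Shapiro-Wilk p-values: {shapiro_p1:.4f}, {shapiro_p2:.4f}")

# 3. outliers (assumption: values beyond 1.5 * IQR from the quartiles)
def iqr_outliers(values):
    """Hypothetical helper: return the values outside the 1.5 * IQR fences."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    mask = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)
    return values[mask]

print("Flagged in group 1:", iqr_outliers(data_group1))
print("Flagged in group 2:", iqr_outliers(data_group2))

# 4. two-sample t-test; fall back to Welch's version if variances look unequal
equal_var = levene_p > 0.05  # illustrative cutoff
t_stat, p_value = stats.ttest_ind(data_group1, data_group2, equal_var=equal_var)
print(f"t = {t_stat:.4f}, p = {p_value:.4f}")
```

Whether a flagged point counts as a "significant" outlier is a judgment call; the IQR rule only marks candidates for a closer look.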

Exercise 2: Fitting Curves with scipy

Example from geeksforgeeks.org.

Instructions:

  1. Create a function sine_fit that accepts an independent variable (your data), the amplitude, and the angular frequency of a sine wave and returns the resulting y-value
  2. Call scipy's curve_fit to estimate the parameters that best fit your sample data (Hint: the first argument is your callable sine_fit function, followed by the x and y data)
  3. Plot the sample data in red and overlay the fitted curve as a dashed blue line

Deliverables:

# import the necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# create sample data
x = np.linspace(0, 10, num = 40)
y = 3.45 * np.sin(1.334 * x) + np.random.normal(size = 40)
# Your code here!
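
A minimal sketch of the three steps above follows. It assumes the second parameter of sine_fit multiplies x inside the sine, which matches how the sample data is generated. The fixed random seed and the starting guess p0 are additions for reproducibility and stable convergence, not part of the original example.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

np.random.seed(0)  # assumption: fixed seed so the noise is reproducible

x = np.linspace(0, 10, num=40)
y = 3.45 * np.sin(1.334 * x) + np.random.normal(size=40)

def sine_fit(x, amplitude, frequency):
    """Sine wave with the given amplitude and angular frequency."""
    return amplitude * np.sin(frequency * x)

# p0 is an illustrative starting guess; curve_fit defaults to all ones without it
params, covariance = curve_fit(sine_fit, x, y, p0=[3.0, 1.3])
amplitude, frequency = params
print(f"amplitude = {amplitude:.3f}, frequency = {frequency:.3f}")

# sample data in red, fitted curve as a dashed blue line
fig, ax = plt.subplots()
ax.scatter(x, y, color="red", label="sample data")
x_dense = np.linspace(0, 10, num=200)
ax.plot(x_dense, sine_fit(x_dense, amplitude, frequency), "b--", label="fitted curve")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
```

Call plt.show() or fig.savefig(...) afterwards to display or save the figure. The square roots of the diagonal of covariance give rough standard errors for the two parameters.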

Exercise 3: Sensitivity Analysis with sensitivity

Example from the Sensitivity Analysis Documentation.

Instructions:

  1. Create a Python dictionary whose keys are the strings 'x1' and 'x2' (matching the argument names of my_model) and whose values are x1_vals and x2_vals
  2. Use SensitivityAnalyzer and my_model to obtain a DataFrame and hexbin plot of the results
  3. Add a third key x3 with values [5, 10, 15] to your dictionary
  4. Analyze and plot the pairwise sensitivities of all three variables using my_model2

Deliverables:

# import the necessary libraries
import pandas as pd
from sensitivity import SensitivityAnalyzer

# define some functions for known models
def my_model(x1, x2):
    return x1 ** x2

def my_model2(x1, x2, x3):
    return x1 * x2 ** x3

# create sample data
x1_vals = [10, 20, 30]
x2_vals = [1, 2, 3]
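
To make clear what the analysis in the steps above computes, here is a hand-rolled sketch that evaluates each model at every combination of input values. It deliberately uses itertools and pandas rather than the sensitivity package, so it is not that package's API. The helper evaluate_grid, the "Result" column name, and averaging over x3 for the pairwise table are all assumptions made for illustration.

```python
from itertools import product

import pandas as pd

def my_model(x1, x2):
    return x1 ** x2

def my_model2(x1, x2, x3):
    return x1 * x2 ** x3

def evaluate_grid(func, value_dict):
    """Hypothetical helper: evaluate func at every combination of inputs."""
    names = list(value_dict)
    rows = []
    for combo in product(*(value_dict[name] for name in names)):
        kwargs = dict(zip(names, combo))
        rows.append({**kwargs, "Result": func(**kwargs)})
    return pd.DataFrame(rows)

# keys match the argument names of the model functions
sensitivity_dict = {"x1": [10, 20, 30], "x2": [1, 2, 3]}
grid2 = evaluate_grid(my_model, sensitivity_dict)
print(grid2)

# add the third input and switch to the three-argument model
sensitivity_dict["x3"] = [5, 10, 15]
grid3 = evaluate_grid(my_model2, sensitivity_dict)

# pairwise view of x1 vs. x2, averaging over x3 (assumption: mean aggregation)
pairwise_x1_x2 = grid3.pivot_table(index="x1", columns="x2",
                                   values="Result", aggfunc="mean")
print(pairwise_x1_x2)
```

The same pivot can be repeated for the (x1, x3) and (x2, x3) pairs to cover all three pairwise views.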