Write a general Cohen d function to be more generally useful

    Assignment Instructions

    Control Structures Assignment –
    There are six exercises. You are required to provide solutions for at least four of the six. You are required to solve at least one exercise in R, and at least one in SAS.
    Exercise 1 –
    Write a more general Cohen's d function that accepts a wider range of arguments. For convenience, name this general.d.
    The new function should accept two parameters, m and s.
    In your function, check for these conditions:

    • If m is of length 1 and s is length 1, then simply divide m/s – that is, proceed with the calculations as if m = %Diff and s = CV.
    • If m is of length 2, then calculate the difference and proceed with the calculations.
    • If m is of length greater than 2, find the difference between the min and max of m and proceed with the calculations.
    • If s is of length greater than 1, calculate the pooled sd as

    $s_{pooled} = \sqrt{\sum_{i=1}^{k} s_i^2 / k}$
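The four conditions above can be sketched in R as follows. This is one possible reading, not a reference solution: taking the absolute difference when m has length 2 is an assumption, and k is taken to be the length of s.

```r
# Sketch of general.d following the listed conditions.
general.d <- function(m, s) {
  # Pool the standard deviations when more than one is supplied:
  # s_pooled = sqrt(sum(s_i^2) / k), with k = length(s).
  if (length(s) > 1) {
    s <- sqrt(sum(s^2) / length(s))
  }
  # Reduce m to a single difference.
  if (length(m) == 2) {
    m <- abs(m[1] - m[2])   # assumption: report the absolute difference
  } else if (length(m) > 2) {
    m <- max(m) - min(m)
  }
  # If length(m) == 1, m is already a difference (e.g. %Diff with s = CV).
  m / s
}

general.d(10, 5)                # scalar case: 10 / 5
general.d(c(12, 10), c(3, 4))   # difference of 2 over the pooled sd
general.d(c(5, 9, 7), 2)        # range of m (9 - 5) over s
```

Handling s before m keeps the two checks independent, so any combination of lengths for m and s goes through the same final division.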
    Exercise 2 –
    Previously, we’ve calculated required replicates based on the z distribution. In this exercise, you will calculate required replicates based on the t distribution. You must implement one of two algorithms given below. For both algorithms, calculate degrees of freedom as ν = n ∗ k – k where n is the current estimate for required replicates and let k = 2
    Algorithm 1 (from Cochran and Cox, Experimental Design)
    Use the formula:
    $n \ge 2 \times (CV/\%Diff)^2 \times (t_{\alpha/2,\nu} + t_{\beta,\nu})^2$
    1. Start with a small n, say, 2.
    2. Calculate the critical $t_{\alpha/2}$ and $t_{\beta}$ quantiles with ν d.f., then calculate required replicates. Label this $n_{current}$.
    3. Update ν using $n_{current}$, then recalculate critical values and required replicates. Label this $n_{next}$.
    4. If $n_{current} = n_{next}$, the algorithm has converged. Otherwise, set $n_{current}$ to $n_{next}$ and repeat steps 2-3.
    5. If the algorithm has not converged after some sufficiently large number of iterations (say, 20), print a message and return the larger of $n_{current}$ and $n_{next}$.
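A minimal sketch of Algorithm 1 in R, under several assumptions that the text does not fix: mu is a vector of two means, β is the Type II error rate (so the upper quantile is qt(1 - beta, nu)), the defaults are alpha = 0.05 and beta = 0.2, and n is never allowed below 2 so that ν stays positive.

```r
# Sketch of Algorithm 1. alpha, beta and max.iter are assumed defaults,
# and mu is assumed to be a vector of two means.
required.replicates.t <- function(mu, sigma, k = 2,
                                  alpha = 0.05, beta = 0.2, max.iter = 20) {
  ratio <- sigma / abs(mu[1] - mu[2])   # plays the role of CV / %Diff

  # Required replicates given a current estimate n (steps 2 and 3).
  n.for <- function(n) {
    nu <- max(n, 2) * k - k             # assumption: keep nu > 0
    t.sum <- qt(1 - alpha / 2, nu) + qt(1 - beta, nu)
    ceiling(2 * ratio^2 * t.sum^2)
  }

  n.current <- n.for(2)                 # step 1: start from a small n
  n.next <- n.for(n.current)
  iter <- 1
  while (n.next != n.current && iter < max.iter) {
    n.current <- n.next                 # step 4: not converged, iterate
    n.next <- n.for(n.current)
    iter <- iter + 1
  }
  if (n.next != n.current) {
    message("No convergence after ", max.iter, " iterations")
  }
  max(n.current, n.next)                # step 5; equal when converged
}

required.replicates.t(mu = c(10, 11), sigma = 1)
```

Rounding up with ceiling inside the update makes the equality test in step 4 meaningful, since it compares whole replicate counts.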
    Algorithm 2 –
    1. Start with a small n, say, 2.
    2. Calculate the critical $t_{\alpha}$ quantile using the central t distribution with ν d.f.
    3. Estimate the Type II error (p-value) under the alternate hypothesis using the non-central t distribution with ν d.f., at the critical t from step 2. Calculate the non-centrality parameter as
    $NCP = \frac{\%Diff}{CV} \sqrt{n/2}$
    4. If the resulting error is less than 1 – β, accept the current value of n. Otherwise increment n and repeat 2-3.
    5. If desired power is not achieved after a large number of iterations (say, 1000), terminate the calculations and return NA.
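Algorithm 2 might be sketched in R as below. The assumptions are labeled in the code: mu is a vector of two means, the test is two-sided (so the critical value uses alpha / 2, to agree with Algorithm 1), and the target is passed as a hypothetical power argument, with n accepted once the estimated Type II error falls below 1 - power.

```r
# Sketch of Algorithm 2 using the non-central t distribution.
# alpha, power and max.iter are assumed defaults.
required.replicates.t <- function(mu, sigma, k = 2,
                                  alpha = 0.05, power = 0.8,
                                  max.iter = 1000) {
  d <- abs(mu[1] - mu[2]) / sigma          # plays the role of %Diff / CV
  n <- 2                                   # step 1: start small
  for (iter in seq_len(max.iter)) {
    nu <- n * k - k
    t.crit <- qt(1 - alpha / 2, nu)        # step 2 (assumption: two-sided)
    ncp <- d * sqrt(n / 2)                 # non-centrality parameter
    type2 <- pt(t.crit, nu, ncp = ncp)     # step 3: P(fail to reject | H1)
    if (type2 < 1 - power) return(n)       # step 4: accept n
    n <- n + 1
  }
  NA                                       # step 5: power not reached
}

required.replicates.t(mu = c(10, 11), sigma = 1)
```

Because n increases by one per pass, this version always returns the smallest n that meets the target, at the cost of more iterations than Algorithm 1 when the effect is small.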
    Implement the algorithm as a function or macro named required.replicates.t, with parameters mu, sigma and an optional parameter k. Test your function by comparing with required replicates from prior exercises for calories per serving, 1936 versus 2006, 1936 vs 1997 and 1997 vs 2006.
    For either algorithm, you might consider starting with an initial value of n calculated using the z critical values as before. Can you be certain that the z formula will not estimate more required replicates than the t algorithm?
    Exercise 3 –
    Calculate a cumulative probability value from the normal pdf, using the Newton-Cotes formula
    $\int_{x_0}^{x_n} f(x)\,dx \approx \sum_{i=0}^{n} h f(x_i)$
    where $x_1, \ldots, x_n$ is a sequence of evenly spaced numbers from -2 to 2, with $x_i = x_0 + hi$, n is the number of $x_i$ in the sequence, and the step size is $h = (x_n - x_0)/n$.
    We will calculate this integral by calculating successive approximations of f = L(x; 0, 1) = norm.pdf over series of x with increasingly smaller step sizes.
    Part a – Calculate $L_0$ by summing over $L(X_0)$, where $X_0$ is a series from $x_0 = -2, \ldots, x_n = 2$ incremented by $h_0 = 0.1$. Multiply this sum by $h_0$ for an approximate $\int_{x_0}^{x_n} L(x)\,dx$.
    Think of this as the sum of a series of rectangles, each h wide and a height given by the normal pdf.
    Part b – Create a second series $X_1$ by setting $h_1 = h_0/2$. Compute $L_1$ from this series as in part a. Let i = 1. You now have the area of twice as many rectangles as in part a, but each is half as wide.
    Part c – Compute $\delta = |L_i - L_{i-1}|$. If δ < 0.0001, your sequence of iterations has converged on a solution for L. Finish with Part d. Otherwise, increment i and let $h_i = h_{i-1}/2$. Create the next series $X_i$ and compute the next $L_i$.
    Hint: code this first as a for loop of a small number of i until you know your code will converge toward a solution.
    Part d – Report i, n and h.
    To check your results, compare your final Li to
    pnorm(2, lower.tail = TRUE)-pnorm(-2, lower.tail = TRUE)
    ## [1] 0.9544997
    Is your estimate within 0.0001 of this value?
    You might find it useful to produce staircase plots for the first 2-4 iterations (plot Li vs Xi on one graph). You might also find it interesting to plot δ or L versus i or h. You can create vectors to hold the intermediate steps – 10 iterations should be enough. How many iterations might it take to get within 0.000001 of the expected value from R?
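Parts a through d can be sketched in R as a bounded for loop, as the hint suggests. The norm.pdf below is a stand-in written for this example (the course's own version is not shown here), rect.sum is a hypothetical helper name, and n is reported as the number of points in the final series.

```r
# Stand-in normal pdf; the assignment's own norm.pdf is assumed to match.
norm.pdf <- function(x, mu = 0, sigma = 1) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
}

# Sum of rectangles of width h and height norm.pdf(x) over [x0, xn].
rect.sum <- function(h, x0 = -2, xn = 2) {
  x <- seq(x0, xn, by = h)
  h * sum(norm.pdf(x))
}

tol <- 1e-4
h <- 0.1
L.prev <- rect.sum(h)             # Part a: L_0 with h_0 = 0.1
for (i in 1:20) {                 # bounded loop in case of no convergence
  h <- h / 2                      # Parts b and c: halve the step size
  L.curr <- rect.sum(h)
  delta <- abs(L.curr - L.prev)
  if (delta < tol) break          # Part c: converged
  L.prev <- L.curr
}

n <- length(seq(-2, 2, by = h))   # Part d: report i, n and h
cat("i =", i, " n =", n, " h =", h, " L =", L.curr, "\n")
```

Storing L.curr and delta in vectors indexed by i, rather than overwriting them, would give the data needed for the suggested plots.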
    Exercise 4 –
    Part a – Write a function to compute mean, standard deviation, skewness and kurtosis from a single vector of numeric values. You can use the built-in mean function, but must use one (and only one) for loop to compute the rest. Be sure to include a check for missing values. Note that computationally efficient implementations of moments calculations take advantage of $(Y_i - \bar{Y})^4 = (Y_i - \bar{Y}) \times (Y_i - \bar{Y})^3$, etc.
    Your function should return a list with Mean, SD, Skewness and Kurtosis. If you use IML, you will need to implement this as a subroutine and use call by reference; include these variables in the parameter list.
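One way to meet the single-loop requirement in R is sketched below. The name moments.loop is hypothetical, and the estimator choices are assumptions: SD uses the n - 1 divisor (as R's sd does), while skewness and kurtosis are ratios of central moments divided by n, with kurtosis not reduced by 3.

```r
# Sketch: mean, SD, skewness and kurtosis with exactly one for loop.
moments.loop <- function(y) {
  y <- y[!is.na(y)]                 # check for missing values
  n <- length(y)
  if (n < 2) stop("need at least two non-missing values")
  y.bar <- mean(y)
  m2 <- 0; m3 <- 0; m4 <- 0
  for (yi in y) {
    dev  <- yi - y.bar
    dev2 <- dev * dev
    dev3 <- dev2 * dev              # reuse lower powers, as the note suggests
    m2 <- m2 + dev2
    m3 <- m3 + dev3
    m4 <- m4 + dev3 * dev           # (Y - Ybar)^4 = (Y - Ybar) * (Y - Ybar)^3
  }
  m2 <- m2 / n; m3 <- m3 / n; m4 <- m4 / n
  list(Mean = y.bar,
       SD = sqrt(m2 * n / (n - 1)), # assumption: n - 1 divisor, like sd()
       Skewness = m3 / m2^1.5,
       Kurtosis = m4 / m2^2)        # assumption: not excess kurtosis
}

str(moments.loop(c(1, 2, 3, 4, 5, NA)))
```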
    Part b – Test your function by computing moments for Price from pumpkins.csv, for ELO from elo.csv or the combined observations from SiRstvt. I find that ELO shows both skewness and kurtosis, Price is kurtotic but not skewed, while SiRstvt are approximately normal.
    If you wish, compare your function results with the skewness and kurtosis in the moments package. This package also implements tests of significance for skewness and kurtosis.
    Exercise 5 –
    In this exercise, we will use run-time profiling and timing to compare the speed of execution for different functions or calculations. In general, the algorithm will be:
    1. Write a loop to execute a large number of iterations. I find $10^6$ to be useful; you might start with a smaller number as you develop your code.
    2. In this loop, call a function or perform a calculation. You don’t need to use or print the results, just assign the result to a local variable.
    3. Repeat 1 and 2, but with a different function or formula.
    4. Repeat steps 1-3 10 times, saving the time of execution for each pair of the 10 tests. Calculate mean, standard deviation and effect size for the two methods tested.
    If you choose R, I’ve included framework code using Rprof; I’ve included framework code for IML in the SAS template.
    Test options – In homework, you were given two formulas for the Poisson pmf,
    $f(x; \lambda) = e^{-\lambda} \lambda^x / x!$
    $= \exp(-\lambda)(1/x!)\exp[x \times \log(\lambda)]$
    Compare the computational efficiency of these two formulas.

    • Create a sequence x of numbers from -3 to 3 of length $10^6$ or so. In the first test, determine the amount of time it takes to compute $10^5$ estimates of norm.pdf by visiting each element of x in a loop. In the second test, simply pass x as an argument to norm.pdf. Does R or IML optimize vector operations?
    • The mathematical statement √x can be coded as either sqrt(x) or x^(1/2). Similarly, $e^x$ can be written as exp(1)^x or exp(x). These pairs are mathematically equivalent, but are they computationally equivalent? Write two test loops to compare formulas with either √x or $e^x$ of some form (the normal pdf, perhaps).
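The timing procedure could be sketched in R as follows for the two Poisson formulas. This sketch swaps in system.time for the Rprof framework mentioned above (which is not reproduced here); pois.direct, pois.log and time.loop are hypothetical names, and the iteration count is kept small so the example runs quickly.

```r
# Two mathematically equivalent Poisson pmf formulas.
pois.direct <- function(x, lambda) exp(-lambda) * lambda^x / factorial(x)
pois.log <- function(x, lambda) {
  exp(-lambda) * (1 / factorial(x)) * exp(x * log(lambda))
}

# Elapsed seconds for n.iter calls of f; the result is assigned, not used.
time.loop <- function(f, n.iter) {
  elapsed <- system.time(
    for (j in seq_len(n.iter)) tmp <- f(3, 2.5)
  )
  unname(elapsed["elapsed"])
}

n.iter <- 1e4    # raise toward 1e6 once the code works
reps <- 10
t1 <- numeric(reps)
t2 <- numeric(reps)
for (r in seq_len(reps)) {
  t1[r] <- time.loop(pois.direct, n.iter)
  t2[r] <- time.loop(pois.log, n.iter)
}

# Effect size as a difference in mean time over the pooled sd.
s.pooled <- sqrt((var(t1) + var(t2)) / 2)
effect <- (mean(t1) - mean(t2)) / s.pooled
cat("mean:", mean(t1), mean(t2), " sd:", sd(t1), sd(t2),
    " effect size:", effect, "\n")
```

With too few iterations the elapsed times can all round to zero, which makes the pooled sd zero and the effect size undefined, so the iteration count matters for more than precision.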

    Exercise 6 –
    Write an improved Poisson pmf function named smart.pois, using the same parameters x and lambda as before, but check x for the following conditions:
    1. If x is negative, return a missing value (NA in R, . in SAS).
    2. If x is non-integer, truncate x, then proceed.
    3. If x is too large for the factorial function, return the smallest possible numeric value for your machine. What x is too large? You could test the return value of factorial against Inf.
    You can reuse previously tested code by writing this function as a wrapper for a previously written pois.pmf, calling that function only after testing for the specified conditions.
    Test this function by repeating the plots from Homework 4, Ex 4. How is the function different than dpois?
    Warning – You may not be able to call this new function exactly as in the last exercise (Hint – what are the rules for conditions in if statements?). Instead, you might need to create a matrix or data table and use apply functions, or write a loop to visit each element in a vector of x.
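A wrapper along these lines is sketched below in R. The pois.pmf here is a stand-in for the previously written function, and reading "smallest possible numeric value" as .Machine$double.xmin is an assumption. Because if expects a single condition, the wrapper handles one x at a time and is applied over a vector with sapply.

```r
# Stand-in for the previously written Poisson pmf (assumed form).
pois.pmf <- function(x, lambda) exp(-lambda) * lambda^x / factorial(x)

# Wrapper that checks a single x before calling pois.pmf.
smart.pois <- function(x, lambda) {
  if (x < 0) return(NA_real_)               # condition 1: negative x
  x <- trunc(x)                             # condition 2: non-integer x
  x.fact <- suppressWarnings(factorial(x))  # overflow gives Inf plus a warning
  if (is.infinite(x.fact)) {
    return(.Machine$double.xmin)            # condition 3 (assumed reading)
  }
  pois.pmf(x, lambda)
}

# if() needs one condition at a time, so visit each element of x.
x <- c(-1, 0, 2.7, 5, 200)
p <- sapply(x, smart.pois, lambda = 3)
print(p)
```

Comparing p with dpois(x, 3) on the same vector is one way to see where the two functions differ.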
    Note – Just do 4 exercises in R and 1 in SAS.
    Attachment:- Control Structures Assignment Files.rar
