Data Analysis Ghostwriting | STAT 4220 2022 Fall Final DA Exam
AUTHOR
essaygo
PUBLISHED ON:
December 21, 2022

This is a data analysis final exam from the United States, posted as a ghostwriting sample.

 

This exam contains 4 pages (including this cover page) and 4 problems. The project is worth 21 points in total and accounts for 21% of your final grade.

No collaboration is allowed on this final project. You must work entirely on your own.

You are required to submit the report as a PDF together with your R script (or Rmd file).

You are required to show your work on each open problem. The following rules apply:

Organize your work in a reasonably neat and coherent way. Work scattered all over the page without a clear ordering will receive very little credit.

Mysterious or unsupported answers will not receive full credit. You must include appropriate R output in the report. You must include thorough discussions and explanations for your conclusions.

  1. (4 points) Now consider an unbalanced completely randomized design. Some semiconductor circuits were placed in a hot chamber, and we were interested in finding how long each circuit could work without failure. The factor (temperature) had five levels, and the response was the time until the first failure occurred. Some scientists believe that as the temperature increased, some samples were destroyed, which resulted in the following table:

 

The data are given in q1.txt. Please analyze the data and judge whether the scientists’ belief is reasonable. Hint: you can code this unbalanced completely randomized design in the same way as the completely randomized design you learned.
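One possible starting point for an unbalanced completely randomized design is the usual one-way ANOVA F-test, which carries over to unequal group sizes without change. The sketch below is written in Python purely for illustration (the exam itself asks for R). The function name `one_way_anova_f` and the numbers in `example_groups` are hypothetical placeholders, not the contents of q1.txt.

```python
# One-way ANOVA F statistic for groups of unequal size (unbalanced CRD).
# `example_groups` holds made-up placeholder values, NOT the q1.txt data.

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Each group mean is weighted by its own sample size; this is the
    # only place where the unequal group sizes enter the calculation.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0, 5.0], [6.0, 8.0]]
f_stat, df1, df2 = one_way_anova_f(example_groups)
print(f"F = {f_stat:.3f} on ({df1}, {df2}) degrees of freedom")
```

The F statistic would then be compared with an F distribution on the reported degrees of freedom. Whether this test alone addresses the scientists' belief depends on how the missing table is read, so treat it as a first step only.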

 

  2. (4 points) In a wool textile experiment, an unreplicated three-way layout was employed to study the effect on y, the number of cycles to failure of lengths of worsted yarn under cycles of repeated loading. The three factors are A, length of test specimen (250, 300, 350 mm); B, amplitude of loading cycle (8, 9, 10 mm); and C, load (40, 45, 50 g). The data (Cox and Snell, 1981) are given in the following table and also in q2.txt.

Please build an appropriate linear model with only main effects and two-factor interactions.

Suppose researchers are interested in the predicted response for the following case: length = 250, amplitude = 10, and load = 45. Please provide the prediction.
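Before fitting, it can help to check that a model with main effects and two-factor interactions leaves any degrees of freedom for error in an unreplicated 3 x 3 x 3 layout. The sketch below does that bookkeeping in Python for illustration only, under the assumption that all three factors are treated as categorical; the variable names are hypothetical.

```python
from itertools import combinations

# Number of levels per factor, as stated in the problem.
levels = {"A": 3, "B": 3, "C": 3}

# Unreplicated full layout: one run per level combination.
n_runs = 1
for n_levels in levels.values():
    n_runs *= n_levels

# A categorical factor with m levels uses m - 1 degrees of freedom.
main_df = {f: m - 1 for f, m in levels.items()}

# A two-factor interaction uses the product of the two main-effect df.
interaction_df = {
    f1 + f2: (levels[f1] - 1) * (levels[f2] - 1)
    for f1, f2 in combinations(levels, 2)
}

model_df = sum(main_df.values()) + sum(interaction_df.values())
residual_df = n_runs - 1 - model_df  # the 1 accounts for the intercept

print(n_runs, model_df, residual_df)
```

Under this assumption the residual degrees of freedom equal those of the omitted three-factor interaction, which is what serves as the error term in the unreplicated layout.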

  3. A 16-run half-fraction design with two levels was used to study a biological system with Herpes simplex virus type 1 (HSV-1) and five antiviral drugs: Interferon-alpha (A), Interferon-beta (B), Interferon-gamma (C), Ribavirin (D), and Acyclovir (E). For each drug, two dosage levels were studied: the high level (+1) was determined by the minimum effective dosage at which the drug’s antiviral effect reached a plateau, and the low level (-1) corresponded to no drug used. The following table shows the actual dosages used. Cell culture was prepared before the experiment. Virus and drugs were added simultaneously and manually during the experiment. Two researchers conducted the experiment independently using the same cell culture, yielding two replicates. The observed data, readout, were the percentage of infected cells after the combination drug treatment. The data are in q3.txt.

(a) (3 points) Which drugs are effective for treating HSV-1? Which drug combinations are optimal?

(b) (2 points) As Ribavirin (D) is known to be very toxic, if the researchers do not want to include D, what will be the most effective drug combination?
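For orientation, a 16-run half fraction of five two-level factors can be built from a full factorial in four factors plus one generated column. The sketch below assumes the generator E = ABCD, which is a common choice but is not stated in the problem; the actual design should be read from q3.txt. It is written in Python for illustration only, and `half_fraction` is a hypothetical helper name.

```python
from itertools import product

def half_fraction(generator_sign=1):
    """Build a 2^(5-1) design in coded units (-1, +1).

    ASSUMPTION: the fifth column is generated as E = +/- A*B*C*D.
    The problem does not state the generator that was actually used.
    """
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        e = generator_sign * a * b * c * d
        runs.append((a, b, c, d, e))
    return runs


design = half_fraction()
for run in design:
    print(run)
```

With this generator the defining relation is I = ABCDE, so main effects and two-factor interactions are aliased only with four-factor and three-factor interactions respectively. If the data file uses a different generator, the alias structure changes accordingly.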

 

  4. A 16-run fractional factorial experiment in 10 factors on sand-casting of engine manifolds was conducted by engineers at the Essex Aluminum Plant of the Ford Motor Company and described in the article “Evaporative Cast Process 3.0 Liter Intake Manifold Poor Sandfill Study,” by D. Becknell (Fourth Symposium on Taguchi Methods, American Supplier Institute, Dearborn, MI, 1986, pp. 120-130). The purpose was to determine which of the 10 factors has an effect on the proportion of defective castings. If needed, you can assume that all three-factor and higher-order interactions are negligible in this study. The design and the resulting proportion of nondefective castings p̂ observed on each run are shown below.

(a) (1 point) What is the resolution of this design?

(b) (2 points) What are the generators of this design?

(c) (2 points) Based on the response p̂, find the best model possible. Is this model appropriate?

(d) (2 points) How about using arcsin(√p̂) as the response? Would you be able to find a better model?
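The arcsine square-root transformation in part (d) is the standard variance-stabilizing transformation for binomial proportions: for a proportion based on n trials, the variance of the transformed value is approximately 1/(4n), regardless of the underlying proportion. A minimal sketch in Python, for illustration only; `arcsine_sqrt` is a hypothetical helper name, and the values passed to it are arbitrary, not data from the study.

```python
import math

def arcsine_sqrt(p):
    """Return arcsin(sqrt(p)) in radians for a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(p))


# Arbitrary illustrative proportions, not the observed responses.
for p in (0.0, 0.5, 1.0):
    print(p, arcsine_sqrt(p))
```

The transformed response lies between 0 and π/2 and stretches the scale near 0 and 1, where raw proportions have the smallest variance.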
