Terms in this set (77)

Which of the following statements is true about variables?
1. Nominal variables are expressed by numbers
Typically we have two kinds of variables: continuous and discrete

Which of the following variables is NOT a discrete variable?
Average temperature across counties in Arizona

Which of the following is a discrete variable?
1. Weight of students in class
Number of red marbles in a jar

Which of the following is a continuous variable?
1. Number of "heads" when flipping three coins
Height of students in a class

Which of the following statements is true about the median and the mean?
1. The mean is the middle number in a sorted set of observations
Extreme observations do not affect the calculation of the median

Which of the following statements is true?
1. Standard deviation is the mean distance of the observations from their median
Mean and median are measures of center for a random variable

Which of the following statements is true about percentiles?
1. The 50th percentile is the mean of a distribution
The 50th percentile is the median of a distribution and divides the data into two equal parts

Which of the following statements is true about populations and samples?
1. In most cases, we shouldn't worry about making sure our sample is randomly selected
Sometimes the sample average does not agree with the population average

Which of the following is true about probability distributions?
1. The chi-square distribution is not related to the t-distribution
A probability distribution shows the likelihood of occurrence of specific outcomes of a random variable

Which of the following statements is true about charts?
1. Charts can help us understand individual variables' distributions but cannot help us understand the relationships among variables
Charts can help us understand individual variables' distributions and relationships among variables

Which of the following statements is NOT true about scatterplots?
1. We don't need scatterplots; we can get the information we need from means, medians, and standard deviations
We don't need scatterplots; we can get the information we need from the mean, median, and standard deviations

Which of the following statements is true about lines?
1. The standard formula for a line is y=a+bx and a and b are numbers representing the intercept and slope, respectively
The standard formula for a line is y=a+bx and a and b are numbers representing the intercept and slope

Which of the following statements is NOT true about R?
1. R is an open-source programming language
Excel can do everything R can do. R is an open-source programming language.

Which of the following statements is NOT true about R?
1. R has versions for both Mac and Windows computers
Individuals cannot contribute any new code to R

Which of the following statements is true about the R console?
1. We cannot directly type commands into the console
It is the user interface for the actual engine doing the computations

Which of the following statements is true about R?
1. We can either save our code in the console or script files depending on our personal preferences
I don't need to save my workspace image if I save my script files often

Which of the following statements is true about variables in R?
R will not return anything when you create a variable

Which of the following statements is NOT true about lists in R?
1. A list cannot be stored as a variable
A list cannot be stored as a variable

Which of the following statements is NOT true about matrices in R?
1. We use the "matrix" command to create a matrix in R
When we reference things in a matrix, we need parentheses

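The cards on the mean, the median, and the 50th percentile can be checked with a few lines of code. The following is only a minimal sketch in Python (the course itself uses R), and the height values are made up for illustration. It shows that replacing one observation with an extreme value moves the mean a lot but leaves the median where it was.

```python
from statistics import mean, median

# Hypothetical heights in cm; illustrative values, not from the card set.
heights = [150, 160, 165, 170, 175]

# Same data, but the largest observation is replaced by an extreme value.
heights_extreme = [150, 160, 165, 170, 900]

# The median is the middle value of the sorted data (the 50th percentile).
print("median:", median(heights), "->", median(heights_extreme))

# The mean uses every value, so the extreme observation pulls it upward.
print("mean:  ", mean(heights), "->", mean(heights_extreme))
```

Here the median is 165 for both lists, while the mean moves from 164 to 309.
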
Which of the following statements is true about Data Frames?
1. The working directory is the location from which I can import files
A dataset is organized in rows and columns and a row represents an observation

Which of the following is correct about importing data?
1. The working directory is the location from which I can import files
The working directory is the place from which I can import files

Which of the following statements is correct about manipulating stored data?
1. If we would like to view the number of drivers working on the first three days in our dataset, we can simply run the command driversworking[1:3]
A $ sign allows us to access variables within a dataset

Which of the following statements is correct about using built-in functions?
1. We can take advantage of R to get the mean, median, and standard deviation of our dataset very quickly
We can take advantage of R to get the mean, median, and standard deviation of our dataset very quickly

Which of the following statements is true about graphs and charts?
1. Graphs and charts in R help us to visualize our data so that we can see the entire distribution for a variable
Graphs and charts in R help us to visualize our data so that we can see the entire distribution for a variable

Which of the following statements is NOT true about the aggregate function?
1. We can count the observations in many groups at a time using the aggregate function
R will not allow us to aggregate over more than one variable at a time

Which of the following statements is NOT correct about reassigning existing values?
1. We can use the which function to see when d$holiday==1
We only need the which function to change existing values

Which of the following is true about creating a new variable and adding it to our dataset?
1. R will allow us to create a new variable but it will not allow the new variable to be stored in an existing dataset
R will allow us to create a new variable and store it in an existing dataset

For the linear regression Y=b0+b1*X+e, which of the following statements is true?
1. R is trying to maximize the "e" when finding the line of best fit
"e" is the distance between the actual Y and the predicted Y

Which of the following statements is NOT correct?
1. "lm" means linear model and this is the function that finds the line that minimizes the distance to each point simultaneously
"lm" means linear model and this is the function that finds the line that maximizes the distance to each point simultaneously

Which of the following statements is NOT true?
1. The size of the errors indicates how "good" the line is as a model
We can always get a perfect prediction if we try harder

Which of the following statements is NOT correct?
1. Ordinary least squares does not require any distributional assumptions
For a null hypothesis of "the coefficient is 0", R has intuition about the alternative hypothesis

Which of the following statements is NOT correct?
1. When we choose nominal coding, we need to assign one of the choices to be the reference
When considering gender, male has to be 0 and female has to be 1

Which of the following statements is NOT correct?
1. Sum of squared errors are the errors in the model equation squared and summed up across each data point
R-squared is a measure of the model fit for all kinds of models

Which of the following statements is NOT true?
1. When we interpret an intercept, we plug in zero for all of the X variables
The phrase "on average" is not always used when interpreting model coefficients

Which of the following statements is true about multicollinearity?
1. Multicollinearity is a strong correlation between predictor variables (X) and the dependent variable (Y) in a model
Multicollinearity is a strong correlation between predictors (X variables) in a model

Which of the following statements is NOT true about multicollinearity?
1. We cannot use a model that contains multicollinearity
We cannot use a model that contains multicollinearity

Which of the following is correct about Assumption 1 for linear regression?
1. We can include predictor variables that are not relevant to the dependent variable in our model as long as the coefficients of other variables are significant
We need a good linear model

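The regression cards above describe "e" as the distance between the actual Y and the predicted Y, and the line of best fit as the one that makes those errors small. The following is a rough sketch of that idea in Python rather than R, using the textbook closed-form least squares formulas for a single predictor. The data and the helper names `fit_line`, `residuals`, and `sse` are hypothetical and exist only for this illustration.

```python
# Hypothetical (x, y) observations, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(b0, b1, x, y):
    """The "e" terms: actual Y minus predicted Y for each observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def sse(b0, b1, x, y):
    """Sum of squared errors for the line with intercept b0 and slope b1."""
    return sum(e ** 2 for e in residuals(b0, b1, x, y))


b0, b1 = fit_line(x, y)
best = sse(b0, b1, x, y)
print("intercept:", b0, "slope:", b1, "SSE:", best)

# Nudging the slope or the intercept away from the fit makes the SSE larger.
print(sse(b0, b1 + 0.1, x, y) > best)
print(sse(b0 + 0.1, b1, x, y) > best)
```

Even at the best fit the SSE is greater than zero for this data, which matches the card saying that a perfect prediction is not always possible.
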
Which of the following statements is correct about Assumption 2 in linear regression?
1. When we spot perfect multicollinearity in our model, we should remove one of the two predictor variables involved
When we spot perfect multicollinearity in our model, we should remove one of the two predictor variables involved

Which of the following statements is correct about Assumption 3?
1. When we use the ACF method, significant correlations beyond lag=0 are ok
We must use our intuition and two other tests for this assumption

Which of the following statements is correct about Assumption 4?
1. Homoskedasticity means we have equal variances in the observations of the response
Homoskedasticity means we have equal variances in the observations of the response

Which of the following statements is true about Assumption 5?
1. Distributional assumptions are required for calculating the coefficients in a model
Normally distributed errors are required for a linear regression model

Which of the following statements is true about prediction?
1. Higher R squared is always associated with good prediction
Sometimes we choose a model based on best prediction accuracy and not on R squared

Which of the following statements is correct about cross-validation?
1. Cross-validation is a standard practice for validating a predictive model's performance
Cross-validation is a standard practice for validating a predictive model's performance

Which of the following statements is correct?
1. In-sample errors are the errors that come from predicting results of observations the model has not yet seen
In-sample errors are the errors that come from predicting the values used to create the model

Which of the following statements is true?
1. We should use linear regression to predict continuous outcomes and logistic regression to predict binary outcomes
We should use linear regression to predict continuous outcomes and logistic regression to predict binary outcomes

Which of the following statements is NOT correct about logistic regression?
1. The probabilities of success and failure add up to 1 in a binary logistic regression model
The sum of the probabilities of success and failure could be smaller than 1 in a binary logistic regression model

Which of the following statements is NOT true about odds ratios?
1. The odds ratio is the probability of success over the probability of failure
The odds ratio is the probability of failure divided by the probability of success

Which of the following statements is NOT true about logarithms?
1. If the odds ratio is greater than 1, the log odds ratio is positive
We can take the log of any number

Which of the following statements is true?
1. "glm" means "generalized linear model" which is exactly the same as logistic regression
Maximum Likelihood Estimation is the estimation method by which we find the intercept and slope for a logistic regression model

Which of the following statements is NOT correct?
1. For a logistic regression model, we can see how the predictors might change the odds of an outcome
For a logistic regression model, we can see how the predictors change the outcome directly

Which of the following statements is correct?
1. When a log odds ratio is greater than 0, it means that the odds ratio is greater than 1
When a log odds ratio is greater than 0, it means that the odds ratio is greater than 1

Which of the following statements is true about the logistic regression rule of thumb?
1. If the log odds ratio is 0, it means that the probability of success is 0.5
If the log odds ratio is 0, it means that the probability of success is 0.5

Which of the following statements is NOT correct about interpreting the coefficients of logistic regression?
1. If a variable has a significant positive coefficient, we can say that the log odds ratio increases by the amount of the coefficient, on average, when the variable increases by 1 unit
If a variable has a significant positive coefficient, we can say that the probability of success increases by the amount of the coefficient as the variable increases by 1 unit

Which of the following statements is correct about the donations example?
1. It may be okay to have a model disagree with our intuition, because our intuition is sometimes flawed
It may be okay to have a model disagree with our intuition, because our intuition is sometimes flawed

Which of the following statements is correct about complete information?
1. Complete information is not necessary if there is multicollinearity in the model
If we do not have complete information, our model is likely to make poor predictions

Which of the following statements is NOT correct?
1. If we have no complete separation, then we cannot draw a vertical line that separates all the zeros and ones
No complete separation means that one or more variables classifies the observations into successes and failures perfectly

Which of the following statements is NOT true about sample size?
1. Having a large sample size allows us to assume our response has a normal distribution
Having a large sample size allows us to assume our response has a normal distribution

Which of the following statements is true?
1. For a logistic regression model, R-squared represents the proportion of variation in the response that is explained by the predictor variables
The model fit statistics in the R output for logistic regressions are generated during the maximum likelihood estimation process

Which of the following statements is true?
1. We may have a probability of success that is greater than 1
Things that make "k" (the entire right hand side) more positive will increase the likelihood of success

Which of the following statements is NOT correct?
1. We can use different classification rules for different logistic regression models
If p > .5 in any modeling situation, we should classify the prediction as "success"

Which of the following statements is NOT true about confusion matrices?
1. We can only create a confusion matrix for our training set, we cannot create a confusion matrix for our test set
We can only create a confusion matrix for our training set, we cannot create a confusion matrix for our test set

Which of the following statements is correct?
1. Stability relates to how often we make errors and what kind of errors we make
A good model (of any type) will have both accuracy and stability

Which of the following statements is correct about overfitting?
1. Tackling an overfitting problem can help improve a model's stability
Tackling an overfitting problem can help improve a model's stability

Which of the following statements is true?
1. When we add a new variable to a linear regression, as long as the R-squared increases by at least a tiny decimal, we should keep that variable
We use variable selection techniques to help remove unnecessary variables in our model and reduce overfitting

Which of the following statements is NOT correct?
1. We always remove variables that contribute less than 5% to the model's R-squared
We always remove variables that contribute less than 5% to the model's R-squared

Which of the following statements is NOT true?
1. Even after the application of a variable selection technique, underfitting is still a possibility
Always keep the model as complicated as possible, keeping all variables with significant coefficients

Which of the following statements is correct?
1. Outliers are always observations that don't make sense in the context of the problem
We can identify outliers visually by looking at a histogram or scatter plot for both Y variables and X variables

Which of the following statements is NOT true about outliers?
1. Using a scatter plot to visualize the relationship between distance and fare would not help us to identify outliers
Using a scatter plot to visualize the relationship between distance and fare would not help us to identify outliers

Which of the following statements is NOT true?
1. When we edit an imported dataset in R, we are only editing its copy inside R, NOT the original file
There are clear, straightforward rules about which outliers to remove and how to do it

Which of the following statements is true about interactions?
1. Interpreting interactions is always very straightforward
Interactions exist when X variables affect each other's influence on Y

Which of the following statements is NOT true?
1. Interaction terms are always significant
Interaction terms are always significant.

Which of the following statements is NOT true?
1. An interaction term can contain any combination of discrete and continuous variables
In order to form an interaction term, one of the variables has to be binary and the other has to be continuous

Which of the following statements is true about decision trees?
1. A parent node can have more than 2 children
A decision tree is a list of rules for systematic decision making

Which of the following statements is true?
1. The algorithm finds a single variable and multiple values in that variable that best divide the observations at each step
Every time the algorithm splits the data in a node, it considers all variables at all possible values for its split location

Which of the following statements is correct?
1. Using recursive partitioning, we are dividing the data into smaller sets, with members in the same set being wildly different
Using recursive partitioning, we are dividing the data into smaller sets, each with members sharing similar characteristics

Which of the following statements is true?
1. For a classification tree we use "method=anova" and for a regression tree we use "method=class"
ANOVA means analysis of variance

Which of the following statements is correct?
1. Decision trees are always better because they don't require assumptions
For decision trees, overfitting is a common problem

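Several of the logistic regression cards above rest on the same arithmetic linking the probability of success, what the cards call the odds ratio (probability of success over probability of failure), and its logarithm. The following is a minimal sketch of that arithmetic in Python rather than R. The function names are hypothetical, and `k` stands for the "entire right hand side" mentioned in the cards, on the assumption that it is a log odds value.

```python
import math


def odds(p):
    """Probability of success divided by probability of failure."""
    return p / (1 - p)


def log_odds(p):
    """Natural log of the odds; defined only for 0 < p < 1."""
    return math.log(odds(p))


def prob_from_log_odds(k):
    """Invert the log odds: map any real k back to a probability."""
    return 1 / (1 + math.exp(-k))


# A log odds of 0 corresponds to a probability of success of 0.5.
print(log_odds(0.5), prob_from_log_odds(0))

# Odds greater than 1 give a positive log odds.
print(odds(0.8) > 1, log_odds(0.8) > 0)

# A more positive k gives a higher probability, but never more than 1.
print(prob_from_log_odds(1) < prob_from_log_odds(2) <= 1)
```

Because the logarithm is only defined for positive numbers, `log_odds` fails for p = 0 or p = 1, which is consistent with the card saying we cannot take the log of any number.
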
Which of the following statements is true about a variable?
A variable cannot be subtracted. A variable represents a quantity, so X is representing a quantity which is 1. So the statement is TRUE.

What is NOT true about variables in statistics?
The type of variable does not determine the types of statistical analysis.

Which of the following two statements are true about variables?
Variables will be ignored by compiler. The value assigned to a variable may never change. They allow code to be edited more efficiently.

Which of the following statements is not true for values?
The correct answer is Option E) There is no substantial variability in values across cultures.