SAS Interview Questions and Answers
1. What is SAS?
SAS (Statistical Analysis System)
- SAS is a software suite for advanced analytics, multivariate analyses, business intelligence, data management, and predictive analytics.
- It is developed by SAS Institute.
- SAS provides a graphical point-and-click user interface for non-technical users and more advanced options through the SAS language.
2. What are the features of SAS?
Answer: The following are the features of SAS:
Figure: SAS Interview Questions – Features of SAS
- Business Solutions: SAS provides business analysis tools that companies can use as business products.
- Analytics: SAS is the market leader in the analytics of various business products and services.
- Data Access & Management: SAS can also be used as DBMS software.
- Reporting & Graphics: SAS helps to visualize the analysis in the form of summaries, lists, and graphic reports.
- Visualization: We can visualize the reports in the form of graphs ranging from simple scatter plots and bar charts to complex multi-page classification panels.
3. What are the reasons for choosing SAS over other data analytics tools?
Following are some of the reasons for choosing SAS over other data analytics software tools:
- Compared with other data analytics tools, SAS is regarded as more professional and is comparatively easy to learn and use, especially for users familiar with SQL.
- Although SAS provides limited options for customization, it offers sufficient graphical functionality.
- SAS streamlines the process of storing and managing large amounts of data in an organized manner.
- There are fewer chances of errors as SAS is licensed software that releases updates in a controlled environment. Its features are thoroughly tested.
- SAS offers enterprise-grade security in terms of data privacy.
- SAS offers excellent customer service and technical support. Users receive immediate support whenever they face technical challenges during the installation process.
4. List down a few capabilities of the SAS Framework.
The following are the four capabilities of the SAS Framework:
- Data Accessibility – SAS enables users to access data from varied sources such as Oracle databases, Excel files, SAS datasets, raw databases, etc.
- Data Management – SAS facilitates the generation of beneficial insights by managing previously accessed data. It manages data by creating variables and subsets, cleaning and validating data, etc.
- Data Analysis – SAS further enables users to perform statistical analysis on the managed data. It supports both simple evaluations such as frequency and averages along with complicated evaluations like regression, forecasting, etc.
- Data Presentation – SAS permits the storage of the analyzed data in a data file or as a graphic report, a list, or a summary report, which then can be printed or published.
5. Mention a few capabilities of the SAS Framework.
The following are the four capabilities in SAS Framework:
Figure: SAS Interview Questions – SAS Framework
- Access: As we can learn from the figure, SAS allows us to access data from multiple sources like an Excel file, raw database, Oracle database, and SAS Datasets.
- Manage: We can then manage this data to subset data, create variables, and validate and clean data.
- Analyze: Further, analysis happens on this data. We can perform simple analyses like frequency and averages and complex analyses including regression and forecasting. SAS is the gold standard for statistical analyses.
- Present: Finally we can present our analysis in the form of a list, summary, and graphic reports. We can either print these reports, write them to data files or publish them online.
6. Provide some examples where the defaults of PROC REPORT are different from PROC PRINT’s defaults.
- PROC REPORT does not display observation (record) numbers.
- PROC REPORT uses variable labels as column headers.
- PROC REPORT needs the NOWINDOWS option to send output to the listing instead of opening the interactive report window.
7. What is the function of the output statement in a SAS Program?
In a DATA step, the OUTPUT statement writes the current observation to the output data set immediately instead of waiting for the end of the step. In procedures such as PROC CAPABILITY, you can use the OUTPUT statement to save summary statistics in a SAS data set. This information can then be used to create customized reports or to save historical information about a process.
You can use options in the procedure OUTPUT statement to
- specify the statistics to save in the output data set,
- specify the name of the output data set, and
- compute and save percentiles not automatically computed by the CAPABILITY procedure.
8. State the difference between the BY statement and the CLASS statement in a procedure.
- Contrary to the CLASS statement, the BY statement necessitates sorting or indexing of data in the order of BY variables.
- BY group results and CLASS group results have different layouts.
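The sorting requirement can be illustrated with a minimal sketch. It assumes the SASHELP.CLASS sample data set that ships with SAS; the output data set name work.class_sorted is made up for the example.

```sas
/* CLASS statement: the input data does not need to be sorted */
proc means data=sashelp.class mean;
    class sex;
    var height;
run;

/* BY statement: the data must first be sorted (or indexed) by the BY variable */
proc sort data=sashelp.class out=work.class_sorted;
    by sex;
run;

proc means data=work.class_sorted mean;
    by sex;
    var height;
run;
```

The CLASS version prints one table with a row per group, while the BY version prints a separate table for each BY group.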
9. List down the fundamental features of SAS.
A few key features of SAS include:
- Business Solutions – The business analysis offered by SAS can be used by different companies as business products.
- Analytics – SAS has emerged as one of the leaders in the market of business products and services analytics.
- Data Management and Accessibility – SAS also offers the benefits of DBMS software.
- Data Reporting and Graphics – SAS enables users to present data in the form of lists, summaries, and graphic reports.
- Visualization – Users can visualize reports as graphs ranging from common bar charts and scatter plots to multi-page classification panels.
10. What is the difference between using the DROP= data set option in the DATA statement and in the SET statement?
If you do not want to process certain variables and do not want them to appear in the new data set, specify the DROP= data set option in the SET statement.
If you want to process certain variables but do not want them to appear in the new data set, specify the DROP= data set option in the DATA statement.
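A small sketch of the two placements follows. The data set work.staff and its variables (name, salary, bonus) are hypothetical and exist only for illustration.

```sas
/* Hypothetical input data set with three variables */
data work.staff;
    input name $ salary bonus;
    datalines;
Ann 5000 500
Raj 6200 300
;
run;

/* DROP= on the SET statement: bonus is never read into the step,
   so it cannot be used in any calculation here */
data work.no_bonus;
    set work.staff(drop=bonus);
run;

/* DROP= on the DATA statement: bonus is read and can be processed,
   but it is not written to the output data set */
data work.total_pay(drop=bonus);
    set work.staff;
    total = salary + bonus;
run;
```

Dropping on the SET statement also saves a little work, because the variable is never brought into the step at all.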
11. What is the function of a Stop statement in a SAS Program?
The STOP statement causes SAS to stop processing the current DATA step immediately and resume processing with the statements after the end of that DATA step.
12. What is the difference between reading data from an external file and reading data from an existing data set?
The main difference is that when reading an existing data set with the SET statement, SAS retains the values of the variables from one observation to the next. When reading data from an external file with the INPUT statement, the variables are reset to missing at the start of each iteration of the DATA step, and they must be defined in the INPUT statement before they can be used.
13. Explain the function of the stop statement in a SAS program.
The STOP statement immediately stops SAS from processing the current DATA step and causes it to resume processing after the end of that DATA step.
14. How do you specify variables to be processed by the FREQ procedure?
The TABLES statement specifies the variables to be processed by the FREQ procedure.
15. State the difference between using the DROP= data set option in the DATA statement and in the SET statement.
- The DROP= data set option is specified in the SET statement when neither the processing of certain variables nor their appearance in the new data set is desired.
- The DROP= data set option is specified in the DATA statement when the processing of certain variables is desired, but their appearance in the new data set is not.
16. How many data types are there in SAS?
There are two data types in SAS: character and numeric. Dates are stored as numeric values (the number of days since January 1, 1960), and SAS provides date functions and formats for working with them.
17. What are the differences between PROC MEANS and PROC SUMMARY?
PROC MEANS and PROC SUMMARY compute the same descriptive statistics, and both can produce subgroup statistics with either a CLASS statement or a BY statement. A BY statement requires input data that has been previously sorted (using PROC SORT) by the BY variables. PROC MEANS prints its results by default.
PROC SUMMARY with a CLASS statement produces statistics for all subgroups, giving you all the information in one run that you would get by repeatedly sorting a data set by the variables that define each subgroup and running PROC MEANS with a BY statement. PROC SUMMARY does not print any output by default, so you need to use the OUTPUT statement to create a new data set and PROC PRINT to see the computed statistics, or specify the PRINT option.
18. Mention some of the common programming errors in SAS.
Listed below are some of the very common mistakes that individuals make while writing programs in SAS:
- Missing semicolon – SAS is likely to misinterpret not only the statement missing the semicolon but also numerous following statements.
- Unclosed quotes and comments – Unclosed quotes and unclosed comments might negatively affect SAS’s reading of the subsequent statements and give rise to multiple errors.
- Unmatched quotation marks – The quotation marks must be matched.
- Unsorted data – Data must be sorted before using a statement that necessitates a sort.
- Unchecked submitted programs – Submitted programs must be checked for log entries.
- Invalidity – Invalidity of either the dataset option or the statement option.
- Not using debugging techniques – Users must use debugging techniques.
19. What are the different ways of creating macro variables in SAS programming?
Following are some of the different ways of creating macro variables in SAS programming:
- %Global
- Macro Parameters
- %Let
- Call Symput
- Proc SQL into clause
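The approaches listed above can be sketched together in one short program. The macro name show and the macro variable names are invented for the example, and CALL SYMPUTX is used here as a close relative of CALL SYMPUT that also trims blanks; SASHELP.CLASS is the sample data set shipped with SAS.

```sas
/* %LET in open code */
%let cutoff = 13;

/* CALL SYMPUTX inside a DATA step */
data _null_;
    call symputx('run_date', put(today(), date9.));
run;

/* PROC SQL with the INTO clause */
proc sql noprint;
    select count(*) into :n_students
    from sashelp.class;
quit;

/* Macro parameter (dsn), plus %GLOBAL so that last_dsn
   remains available after the macro finishes */
%macro show(dsn);
    %global last_dsn;
    %let last_dsn = &dsn;
    %put NOTE: macro parameter dsn=&dsn;
%mend show;
%show(sashelp.class)

/* Write all of the macro variables to the log */
%put cutoff=&cutoff run_date=&run_date n_students=&n_students last_dsn=&last_dsn;
```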
20. Briefly explain the Input and Put functions.
Input function – character to numeric conversion – Input (source, informat)
Put function – numeric to character conversion – Put (source, format)
21. Name a few SAS functions.
Substr, Scan, Catx, trim, tranwrd, find, Sum, Index.
22. What are some of the SAS system options used for debugging SAS macros?
Multiple SAS system options can be used for troubleshooting macro problems. The results of these options are written to the SAS log.
- MEMRPT – Shows memory usage statistics.
- MLOGIC – Traces the flow of macro execution (macro logic) and writes it to the log.
- MERROR – A warning is issued whenever a user attempts to invoke a macro that cannot be identified by SAS. The warning message is displayed in case of a misspelling or an undefined macro.
- SYMBOLGEN – Prints a message in the LOG file whenever a macro variable is resolved, stating the resolving process of the macro variable.
- MPRINT – Displays all the SAS statements of the resolved macro code.
23. How many data types are available in SAS?
There are two types of data in SAS, namely character and numeric. Dates are stored as numeric values, and SAS provides date functions and formats for working with them.
24. Give an example where SAS fails to convert character value to numeric value automatically.
Suppose the value of a variable PayRate begins with a dollar sign ($). When SAS tries to automatically convert the values of PayRate to numeric values, the dollar sign blocks the process. The values cannot be converted to numeric values.
Therefore, it is always best to include INPUT and PUT functions in your programs when conversions occur.
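As a rough illustration of the PayRate case, the sketch below converts explicitly instead of relying on automatic conversion. The data values and the variable names PayNum and PayChar are made up; the COMMA informat is chosen because it ignores dollar signs and commas when reading a value.

```sas
data work.pay;
    length PayRate $8;
    input PayRate $;
    /* INPUT function: character to numeric, using an informat
       that strips the dollar sign and commas */
    PayNum = input(PayRate, comma8.);
    /* PUT function: numeric back to character, using a format */
    PayChar = put(PayNum, dollar9.2);
    datalines;
$12.50
$1,050
;
run;

proc print data=work.pay;
run;
```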
25. Can a variable be a character data type if it only contains numbers?
Yes, it will depend upon the use of the variable. The number can be used as a categorical value rather than a quantity. For example, the ID of a particular table or a phone number contains numbers that do not represent any quantity.
26. Can a variable be a numeric data type if it contains letters or special characters?
No, it will be a character data type.
27. What can be the size of the largest dataset in SAS?
SAS datasets, prior to SAS 9.1, could store up to 32,767 variables. From SAS 9.1 onward, the maximum number of variables depends upon the computer’s capacity to manage and store them.
28. What is the difference between PROC MEANS and PROC SUMMARY?
PROC MEANS and PROC SUMMARY are similar techniques for calculating mean, median, count, sum, and other descriptive statistics along with metrics such as percentiles, variances, quartiles, etc.
Following are the two major differences between PROC MEANS and PROC SUMMARY:
- Output – By default, PROC MEANS prints output in the listing window or any other open destination. PROC SUMMARY prints to the output window when the PROC SUMMARY statement includes the print option.
- Numerical variables – When the VAR statement is omitted, PROC MEANS analyzes all the numerical variables, whereas PROC SUMMARY produces only a count of observations. PROC SUMMARY analyzes the numerical variables that are listed in the VAR statement.
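The two differences can be seen side by side in a minimal sketch, assuming the SASHELP.CLASS sample data set; the output data set name work.class_stats is hypothetical.

```sas
/* PROC MEANS prints by default; with no VAR statement it
   analyzes every numeric variable in the data set */
proc means data=sashelp.class;
    class sex;
run;

/* PROC SUMMARY prints nothing unless PRINT is specified;
   here the statistics are written to a data set instead */
proc summary data=sashelp.class;
    class sex;
    var height weight;
    output out=work.class_stats mean=;
run;

/* Display the statistics that PROC SUMMARY saved */
proc print data=work.class_stats;
run;
```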
29. What are _N_ and _ERROR_ in SAS?
A SAS Data Step contains two automatically created variables, namely, the _N_ variable and the _ERROR_ variable.
- _N_ – This variable counts the number of times the DATA step has iterated. It is initially set to 1 and increases by 1 each time the DATA step loops past the DATA statement.
- _ERROR_ – This variable signals errors such as input data errors, math errors, and conversion errors during execution. By default, the value is 0; it is set to 1 when an error occurs.
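A quick way to watch both automatic variables is to write them to the log on every iteration. The sketch below uses made-up data in which the second line is deliberately invalid for a numeric variable.

```sas
data work.check;
    input x;
    /* Write the automatic variables and x to the log for each iteration */
    put _n_= _error_= x=;
    datalines;
10
abc
30
;
run;
```

In the log, _N_ increases with each iteration, and _ERROR_ is set to 1 on the iteration that reads the invalid value, where x is missing.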
30. How does PROC SQL work?
PROC SQL is a simultaneous process for all the observations. The following steps happen when PROC SQL is executed:
- SAS scans each statement in the SQL procedure and checks syntax errors, such as missing semicolons and invalid statements.
- SQL optimizer scans the query inside the statement. The SQL Optimizer decides how the SQL query should be executed in order to minimize run time.
- Any tables in the FROM clause are loaded into the data engine where they can then be accessed in memory.
- Code and Calculations are executed.
- The final Table is created in memory.
- The final Table is sent to the output table described in the SQL statement.
31. What are SAS functions and procedures?
SAS functions – SAS has several built-in functions for facilitating data processing and analysis. Different functions take different numbers of arguments. Here is a list of SAS functions:
- SCAN()
- COUNTC()
- COMPRESS()
- INPUT()
- SUBSTR(), etc.
SAS procedures – SAS procedures facilitate data processing in SAS data sets for creating tables, reports, charts, statistics, etc, and performing other data operations and analysis. Following are some of the SAS PROCs (procedures):
- PROC MEANS
- PROC SQL
- PROC SORT
- PROC FREQ
- PROC REPORT, etc
32. What is the length assigned to the target variable by the scan function?
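In a DATA step, if the target variable has not already been given a length, the SCAN function assigns it a length of 200.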
33. Name a few SAS functions.
Scan, Substr, trim, Catx, Index, tranwrd, find, Sum.
34. What do you mean by the APPEND process?
To append means to add to the end. The APPEND operation in SAS adds the observations of one SAS data set to the end of another SAS data set.
35. What does the TRANWRD function do?
The TRANWRD function replaces or removes all occurrences of a pattern of characters within a character string.
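For instance, a minimal sketch with a made-up sentence, in which every occurrence of one word is replaced by another:

```sas
data _null_;
    text  = 'SAS is hard. Learning SAS is hard.';
    /* Replace every occurrence of 'hard' with 'fun' */
    fixed = tranwrd(text, 'hard', 'fun');
    put fixed=;
run;
```

The log shows the sentence with both occurrences replaced.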
36. What does the CALENDAR procedure accomplish?
The CALENDAR procedure displays data from a SAS data set in a monthly calendar format.
37. Describe the BMDP procedure.
The BMDP procedure calls a BMDP program to analyze data in a SAS data set.
38. What SAS functions are utilized for processing characters?
SAS provides character functions, such as UPCASE and LOWCASE, to handle character values.
39. Describe the CALL PRXCHANGE routine.
The CALL PRXCHANGE routine performs a pattern-matching replacement.
40. Describe how the BOR works.
The BOR function returns the bitwise logical OR of two arguments.
41. State RUN-group Processing definition.
RUN-group processing lets you submit a group of statements in a PROC step with a RUN statement while keeping the procedure active. The procedure ends only when a QUIT statement or a new step is submitted.
42. Why is the double trailing @@ used in INPUT statements?
The double trailing @@ tells SAS to hold the current input record across iterations of the DATA step, so the next execution of the INPUT statement continues reading from the same record instead of moving to a new one. This makes it possible to read several observations from a single line of raw data.
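A minimal sketch of the double trailing @@ with made-up data, where each raw line holds several id/score pairs:

```sas
data work.scores;
    /* @@ holds the line so the next iteration keeps reading from it */
    input id score @@;
    datalines;
1 90 2 85 3 77
4 68 5 95
;
run;

proc print data=work.scores;
run;
```

The two raw lines produce five observations, one per id/score pair.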
43. How well-versed are you in the SAS data set?
The data that can be analyzed by a SAS program is referred to as a SAS data set. A SAS data set is also called a SAS data table.
A SAS data table is made up of two parts:
- columns, called variables
- rows, called observations
The SAS data set’s pertinent details may be summed up as follows:
- In addition to offering built-in data sets, SAS can read data from external sources such as Excel and Access.
- A temporary data set is used only for the current session and is deleted when the session ends.
- A permanent data set is saved for use in subsequent sessions.
44. How many observations would be there at the end of the data step execution?
12
45. What is the SAS SCAN function used for?
The SCAN function extracts words from a character value in which the words are separated by delimiters (characters or special symbols that separate words in a text string). Given text or a variable containing text, the SCAN function picks out the requested word, which can then be stored in a new variable.
46. Describe the purpose of VFORMATX.
The VFORMATX function returns the format that is associated with the value of the specified argument.
47. What is the STD function?
The STD function returns the standard deviation of the nonmissing arguments.
48. What is the difference between do while and do until?
An important difference between the DO UNTIL and DO WHILE statements is that the DO WHILE expression is evaluated at the top of the DO loop. If the expression is false the first time it is evaluated, the DO loop never executes. The DO UNTIL expression, by contrast, is evaluated at the bottom of the loop, so DO UNTIL always executes at least once.
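The difference can be checked with a small sketch in which both loop conditions are chosen so that only the point of evaluation matters; the variable n is arbitrary.

```sas
data _null_;
    n = 5;

    /* Tested at the top: n < 5 is false, so the body never runs */
    do while (n < 5);
        put 'DO WHILE body executed';
        n = n + 1;
    end;

    /* Tested at the bottom: the body runs once before n >= 5 is checked */
    do until (n >= 5);
        put 'DO UNTIL body executed';
        n = n + 1;
    end;
run;
```

Only the DO UNTIL message appears in the log.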
49. How do you use the do loop if you don’t know how many times you should execute the do loop?
We can use ‘do until’ or ‘do while’ to specify the condition.
50. What use does $BASE64X serve?
$BASE64X encodes character data into ASCII text using Base64 encoding.
51. Which SAS statement does not automatically convert values while doing comparisons?
The WHERE statement does not perform automatic conversions in comparisons, so each value must match the type of the data set variable that it is compared with.
52. What distinguishes the NODUPKEY and NODUP options?
The NODUP option checks for and eliminates duplicate observations, that is, observations in which every variable matches. The NODUPKEY option, on the other hand, checks for observations with duplicate BY variable values and, if any are found, eliminates them.
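A hedged sketch of the two options on a small made-up data set follows. The data set and variable names are hypothetical, and NODUPRECS is written out in full (NODUP is its alias).

```sas
data work.visits;
    input id visit $;
    datalines;
1 A
1 A
1 B
2 A
;
run;

/* NODUPKEY: keeps only the first observation for each value of id */
proc sort data=work.visits out=work.by_key nodupkey;
    by id;
run;

/* NODUPRECS: drops an observation only when every variable matches
   the observation written just before it */
proc sort data=work.visits out=work.by_rec noduprecs;
    by id;
run;
```

With this input, the NODUPKEY result keeps one row per id, while the NODUPRECS result removes only the fully repeated row.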
53. If a variable contains only numbers, can it be a character data type?
Yes, it depends on how you use the variable. There are some numbers we will want to use as categorical values rather than a quantity. An example of this can be a variable called “Foreigner” where the observations have the value “0” or “1” representing not a foreigner and foreigner respectively. Similarly, the ID of a particular table can be in number but does not specifically represent any quantity. Phone numbers are another popular example.
54. How can a SAS program be validated without processing the data?
Place OPTIONS OBS=0 at the beginning of the code. SAS then checks the program without processing any observations, and the log shows the resulting messages, which are highlighted in different colors.
55. If a variable contains letters or special characters, can it be a numeric data type?
No, it must be a character data type.
56. What distinguishes reading data from an existing dataset from reading data from an external file?
The primary distinction is that SAS preserves the values of the variables from one observation to the next when reading an existing data set with the SET statement. When data is read from an external file with the INPUT statement, the variables are reset to missing at the start of each DATA step iteration, and they must be defined in the INPUT statement before they are used.
57. Could you clarify the distinction between PROC MEANS and SUMMARY?
PROC MEANS prints its statistics by default. When a BY statement is used, the input data must already be sorted by the BY variables, and statistics are produced for each BY group.
PROC SUMMARY with a CLASS statement automatically generates statistics for every subgroup in a single run, which is the information you would otherwise get by repeatedly sorting the data set by the variables that define each subgroup and running PROC MEANS. It prints nothing by default, so the results are normally written to a data set with the OUTPUT statement.
58. Which other SAS products have you used, and do you consider yourself proficient with them?
PROC MEANS, PROC TABULATE, PROC REPORT, PROC PRINT, PROC FREQ, PROC UNIVARIATE, etc.
59. What is the meaning of the ‘Of’ there in X = Sum (Of A1-A4, A6, A9);
OF is a keyword that tells the SUM function to treat A1-A4 as a variable list. Without OF, the expression is not interpreted the way we want: the function computes A1 minus A4 and then adds A6 and A9, rather than summing A1 through A4, A6, and A9. The same holds true for the MEAN function.
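The effect of the OF keyword can be worked through with made-up values (note that A5, A7, and A8 are not involved):

```sas
data _null_;
    a1 = 10; a2 = 20; a3 = 30; a4 = 40; a6 = 5; a9 = 1;

    /* Variable list: 10 + 20 + 30 + 40 + 5 + 1 = 106 */
    with_of    = sum(of a1-a4, a6, a9);

    /* Subtraction first: (10 - 40) + 5 + 1 = -24 */
    without_of = sum(a1-a4, a6, a9);

    put with_of= without_of=;
run;
```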
60. What do the PUT and INPUT functions do?
The INPUT function converts character data values to numeric values.
The PUT function converts numeric values to character values.
61. What are some statements that are only recognized at compile time?
DROP, KEEP, RENAME, FORMAT, LABEL, ATTRIB, INFORMAT, BY, WHERE, LENGTH, RETAIN, and ARRAY are all compile-time statements.
62. Which statements are carried out only at execution time?
INPUT, INFILE, OUTPUT, and CALL routines.
63. Which statements have a placement in the DATA step that is critical?
DATA, INPUT, INFILE, RUN, CARDS, WHERE, LABEL, SELECT, INFORMAT, and FORMAT.
64. Give examples of statements that work both during compilation and execution.
OPTIONS, TITLE, and FOOTNOTE.
65. List the many data types that SAS can handle.
Character and numeric data types are the only two that SAS supports. Dates are stored as numeric values, and SAS provides date functions and formats that can be applied to them.
66. What does the tranwrd function do?
Using the TRANWRD function, characters or patterns that appear inside a character string can be replaced or removed.
67. Give a SAS definition of the DATA step.
The DATA step creates a SAS data set, which contains both the data and a “data dictionary.” The data dictionary holds the details of the variables and their properties.
68. Describe Base SAS.
Base SAS is the core SAS product; its programming interface is an older-looking, text-based, rudimentary IDE. Enterprise Guide (EG) is a more GUI-like IDE with wizards that help with developing code for various operations.
69. What are the various approaches to include or exclude particular variables from a dataset?
You can use the DROP and KEEP statements to include or omit particular variables from a data set.
- DROP statement: tells SAS which variables to take out of the data set.
- KEEP statement: identifies the variables in the data set that should be kept.
70. Why would you use the trace option?
When a procedure creates many output objects, ODS TRACE can be used to determine the name of each one.
ODS TRACE ON;
ODS TRACE OFF;
71. What distinguishes a macro variable?
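A macro variable reference is distinguished by the ampersand (&) symbol that precedes its name.
72. What distinguishes %LOCAL from %GLOBAL?
%LOCAL declares a macro variable that exists only inside the macro in which it is defined. %GLOBAL declares a macro variable that is available anywhere, including in open code (outside any macro).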
73. What would constitute a macro’s end?
The %MEND statement marks the end of a macro definition.
74. Distinguish between INPUT and INFILE.
An INFILE statement identifies the external file that contains the data, whereas the INPUT statement describes the variables to be read from that file.
75. Which SAS software command is used to achieve sorting?
Use the SORT procedure (PROC SORT) to sort a data set by one or more variables. With the OUT= option, sorting creates a new data set and leaves the original data set unchanged; without it, the sorted data replaces the original.
76. How are PROC print and PROC contents used?
The PROC step of a SAS program invokes built-in procedures that work on the data in a data set. PROC PRINT lists the observations in a data set, and PROC CONTENTS displays its descriptor information, such as variable names, types, and lengths.
77. Which statement in SAS does not automatically conduct conversions in comparisons?
The “where” statement does not automatically convert values when used in comparisons in SAS.
78. Explain the distinction between the NODUPKEY and NODUP options.
NODUP compares all the variables in the data set, whereas NODUPKEY compares only the BY variables.
79. Mention the SAS validation tools that are employed.
For DATA steps: DATA dataset-name / DEBUG; and DATA dataset-name / STMTCHK;
For macros: OPTIONS MLOGIC MPRINT SYMBOLGEN;
80. What exactly does PROC GLM do?
PROC GLM can perform analysis of covariance, multivariate analysis of variance, and repeated measures analysis of variance.
81. What exactly are SAS informats?
SAS informats are instructions that tell SAS how to read data from a wide range of external files, including flat files, ASCII files, text files, and sequential files. The data is read into SAS variables using the informat provided.
82. What exactly is Linear Regression?
Linear regression is a statistical approach that predicts the value of one variable Y based on the value of another variable X. X is known as the predictor variable, while Y is known as the criterion variable.
83. What do you mean when you say “Normal Distribution”?
Data can be distributed in various ways: it may be skewed to the left or to the right, or it may be scattered with no clear pattern. In many cases, however, the data clusters around a central value with no bias to the left or right, forming a normal distribution. The values of such a random variable follow a symmetrical bell-shaped curve.
84. Into which categories are SAS informats classified?
SAS informats are divided into three types.
- Character Informats: $INFORMATw.
- Date/Time Informats: INFORMATw.
- Numeric Informats: INFORMATw.d
85. What is the purpose of the CATX function?
The CATX function concatenates character strings, removing leading and trailing blanks and inserting separators between them.
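A minimal sketch of this behavior, using made-up names padded with extra blanks:

```sas
data _null_;
    first = '  Ada ';
    last  = ' Lovelace  ';
    /* First argument is the separator; blanks around each piece are removed */
    full  = catx(' ', first, last);
    put full=;
run;
```

The log shows full=Ada Lovelace, with a single separating blank.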
86. Can a variable that solely includes integers be a character data type?
Yes, depending on how you utilize the variable. Some numbers will be used as a categorical value rather than a quantity. An example of this would be the “Foreigner” variable, where the observations would have values “0” and “1”, respectively, denoting not a foreigner and a foreigner. Similarly, the ID of a certain table might be a number but does not reflect any amount. Another common example is phone numbers.
87. Can a variable be of the numeric data type if it contains letters or special characters?
No, it must be a character data type.
88. What is the maximum size of a dataset in SAS?
SAS data sets could include up to 32,767 variables prior to SAS 9.1. From SAS 9.1 onward, the maximum number of variables in a SAS data set is restricted only by the computer’s resources.
89. What are your primary interests in SAS?
Base SAS, SAS/STAT, SAS/GRAPH, and, briefly, SAS/ETS.
90. How do you test and debug SAS programs?
The first step is to check the log for errors, warnings, and, in certain circumstances, NOTEs. The DATA step debugger can also be used.
91. What SAS versions have you used (and on what platforms)?
SAS 9.1.3, 9.0, and 8.2 on Windows and UNIX, as well as SAS 7 and 6.12.
92. What issues might you face when processing missing values? In the data steps? Arithmetic? Comparisons? Functions? Data classification?
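Any arithmetic operation on a missing value produces a missing value, whereas functions such as SUM ignore missing arguments. In comparisons, a missing numeric value is treated as smaller than any number. Observations with missing values for the analysis variables are excluded from analysis in most SAS statistical procedures.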
93. What’s the difference between functions and procedures that do the same simple descriptive statistics calculation?
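A procedure has a broader scope: it computes statistics across the observations of a data set, and the results can be written to another data set. A function computes its result within a single observation, across the arguments passed to it.
94. If you were asked to create multiple records from a single record, how would you do it using an array and proc transpose?
With an array: declare an array over the variables in the record and use a DO loop with an OUTPUT statement to write one record per element. With PROC TRANSPOSE: list the variables of the record in the VAR statement.
95. What are _NUMERIC_ and _CHARACTER_, and what do they do?
They are special variable lists that refer to all the numeric variables and all the character variables in the data set, so that all of them can be read or written at once.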
96. How would you make a data set with 30 variables and 1 observation from a data set with 1 variable and 30 observations?
PROC TRANSPOSE is used.
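As an illustration of reshaping with PROC TRANSPOSE, the sketch below first builds a made-up data set with 30 observations of a single variable and then turns it into one observation with 30 variables. The names work.long, work.wide, and value are hypothetical.

```sas
/* Build a data set with 30 observations and 1 variable */
data work.long;
    do i = 1 to 30;
        value = i * 10;
        output;
    end;
    drop i;
run;

/* Each observation of VALUE becomes a column (COL1-COL30) */
proc transpose data=work.long out=work.wide(drop=_name_);
    var value;
run;
```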
97. What is the distinction between INFORMAT and FORMAT?
INFORMAT: To tell SAS that a number should be read in a certain format.
FORMAT: Indicates how SAS should print the variables.
98. Mention how you read the variables you require.
The variables are read using an input statement with column/line pointers, informats, and length specifiers.
99. What exactly is factor analysis?
Factor analysis is a general name for a group of statistical approaches that involve reducing a collection of observable variables to a small number of hidden components. The primary purpose of factor analysis is to reduce and summarise data.
100. Mention which special input delimiters are used in SAS.
DLM= and DSD are the INFILE statement options used to handle special input delimiters in SAS.
101. With no input data, how might you generate test data?
DATA _NULL_ and the PUT statement are used.
102. When debugging, what can you learn from the SAS log?
It shows the execution of the entire program as well as its logic. It also displays each error along with its line number, allowing you to correct the program.
103. What is the meaning of _ERROR_?
It only has two values: 1 for error and 0 for no error.
104. Is SAS capable of handling missing values in assignment statements, a merge, functions, an update, sort order, formats, and procedures?
In an assignment statement, missing values are assigned as missing. In sort order, the ordinary numeric missing value (.) is the second-smallest value; only the special missing value ._ (underscore) sorts lower.
105. How do you incorporate a “trace” into your program?
By turning on ODS TRACE.
106. How can missing values be tested?
Conditional statements such as IF-THEN/ELSE, WHERE, and SELECT are used.
107. Internally, how are numeric and character-missing values represented?
Character missing values are represented by a blank, and numeric missing values by a period (.).
108. Which of the following date functions advances a date time or date/time value by a given interval?
INTNX.
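A short sketch of INTNX with an arbitrary start date. The optional fourth argument controls alignment; the default aligns the result to the beginning of the interval.

```sas
data _null_;
    start = '15JAN2024'd;

    /* Default alignment: beginning of the next month */
    next_month_start = intnx('month', start, 1);

    /* 'same' alignment: same day of the next month */
    same_day_next = intnx('month', start, 1, 'same');

    put next_month_start= date9. same_day_next= date9.;
run;
```

Here the first result is 01FEB2024 and the second is 15FEB2024.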
109. What is the first action in a typical data step in the flow of data step processing?
When you submit a DATA step, SAS processes it in two phases and then produces a new SAS data set. The first action is compilation:
- Compilation phase (the syntax is checked, and the input buffer and the PDV are created)
- Execution phase
110. SAS/access vs. SAS/connect: What’s the difference?
SAS/ACCESS works only with external databases such as Oracle, SQL Server, and Microsoft Access.
SAS/CONNECT is used solely for connections to a server.
111. What Is the Benefit of Using the N=ps Option?
The N=PS option generates a buffer in memory big enough to contain PAGESIZE (PS) lines, allowing a page to be prepared arbitrarily before printing.
112. What Are SAS Scrubbing Procedures?
Proc Sort with the nodupkey option to remove duplicate values.
114. What differences did you see between SAS Versions 6-8 and 9?
SAS 9 Architecture is substantially different from any previous SAS version. SAS relies on a new component, the Metadata Server, in the SAS 9 design to provide an information layer between the programs and the data they access. Metadata, such as security permissions for SAS libraries and the locations of the numerous SAS servers, are kept in a central repository.
115. What is your most common programming error?
Missing semicolons, failing to check the log after submitting a program, and rarely using debugging techniques or the FSVIEW option.
116. What is the difference between using the mean function to calculate the mean and using proc means?
By default, Proc Means computes summary statistics such as N, Mean, Standard Deviation, Minimum and Maximum, whereas the Mean function just computes mean values.
117. In the Merge Statement, which data set is the controlling data set?
The data set with the fewest observations is the controlling data set in the MERGE statement.
118. Do you like Proc Report or Proc Tabulate better? Why?
I prefer PROC REPORT unless I need to generate cross-tabulation tables, because it gives me many options for modifying the look of my table, whereas PROC TABULATE cannot produce some of the items I need in my table.
119. What are the options for input and output datasets?
OBS= and FIRSTOBS= are options for the input data set, while COMPRESS= and REUSE= are options for the output data set.
KEEP=, DROP=, and RENAME= can be used on both input and output data sets.
120. How does the enterprise guide work? What purpose does it serve?
SAS Enterprise Guide is a point-and-click Windows client for SAS. Among other things, it provides a way to import text files into SAS.
121. How Are Zero Observation Datasets Made?
By using the LIKE clause to create a data set.
The following example:
proc sql;
  create table latha.emp like oracle.emp;
quit;
Here the LIKE clause causes the structure of the existing table to be copied to the new table. The result is an empty table with the same columns.
In the editor window, we type %include "path of the SAS file"; run; if the environment is not a windowing environment, there is no need to include the RUN statement.
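122. How Can a .csv File Be Imported Into SAS?
To create the CSV file, open Notepad and declare the variables on the first line, followed by the data values. The file can then be read into SAS, for example with PROC IMPORT.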
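One common way to read such a file is PROC IMPORT. The sketch below assumes a comma-separated file with variable names on its first line; the file path and the output data set name are placeholders, not values from the original answer.

```sas
/* The path below is a placeholder; point it at a real CSV file */
proc import datafile="/path/to/file.csv"
    out=work.imported      /* hypothetical output data set name */
    dbms=csv
    replace;               /* overwrite work.imported if it already exists */
  getnames=yes;            /* take variable names from the first line */
run;
```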
123. What Function Does Proc SQL Serve?
PROC SQL is a powerful procedure that combines the capabilities of DATA and PROC steps. In a single step, PROC SQL can sort, summarise, subset, join (merge), and concatenate data sets, create new variables, and display the results. PROC SQL can use fewer resources than the equivalent DATA and PROC steps. Data sets do not need to be sorted before they are joined in PROC SQL, whereas a DATA step merge with a BY statement requires sorted or indexed input.
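As a rough illustration of doing several of these operations in one step, the query below subsets, summarizes, and sorts the SASHELP.CLASS sample data set that ships with SAS. The output table name work.avg_height and the age cutoff are arbitrary choices for the example.

```sas
proc sql;
  /* Subset (WHERE), summarize (GROUP BY), and sort (ORDER BY) in one step */
  create table work.avg_height as
  select sex,
         count(*)     as n,
         mean(height) as avg_height
  from sashelp.class
  where age >= 12
  group by sex
  order by avg_height desc;
quit;
```

Doing the same with DATA and PROC steps would typically take a subsetting step, a PROC MEANS or PROC SUMMARY step, and a PROC SORT step.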
124. SAS/GRAPH: What Is It?
The SAS/GRAPH software develops and delivers precise, high-impact visualizations that help decision-makers comprehend important business challenges quickly.
125. Specify the steps to include or exclude particular variables from a data set.
Use the DROP and KEEP statements, or the DROP= and KEEP= data set options.
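Both forms can be sketched against the SASHELP.CLASS sample data set. The output data set names are hypothetical.

```sas
/* KEEP statement: only name and age are written to the output data set */
data work.subset1;
  set sashelp.class;
  keep name age;
run;

/* DROP= data set option: height and weight are never read into the step */
data work.subset2;
  set sashelp.class(drop=height weight);
run;
```

The statement form applies when the output data set is written, while a data set option on the input data set excludes the variables before the step processes them.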
126. What constitutes the SAS base program’s fundamental structure?
The two steps that make up the fundamental structure of a SAS program are the DATA step, which retrieves and manipulates data, and the PROC step, which analyzes the data.
127. Describe how to utilize PROC GPLOT.
PROC GPLOT identifies the data set that contains the plot variables. Compared with PROC PLOT, it offers more options, which lets it produce more colorful and elaborate graphics.
128. How do you sort in descending order?
We may sort in descending order by using the DESCENDING keyword in the PROC SORT code.
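A minimal sketch using the SASHELP.CLASS sample data set follows; the output data set name is an assumption. Note that the DESCENDING keyword goes before the variable it applies to.

```sas
/* Sort from oldest to youngest; DESCENDING precedes the BY variable */
proc sort data=sashelp.class out=work.class_desc;
  by descending age;
run;
```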
129. What are the INPUT and PUT functions used for?
INPUT function: converts character values to numeric values using an informat.
PUT function: converts numeric values to character values using a format.
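The two conversions can be sketched in a short DATA step. The variable names are illustrative only.

```sas
data _null_;
  char_value = '1234';
  num_value  = input(char_value, 8.);   /* character -> numeric using the 8. informat */

  date_num   = '15MAR2021'd;            /* numeric SAS date value */
  date_text  = put(date_num, date9.);   /* numeric -> character using the DATE9. format */

  put num_value= date_text=;            /* log: num_value=1234 date_text=15MAR2021 */
run;
```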
130. What do you mean by the sum function and “+” operator?
Both the SUM function and the + operator perform addition in SAS. The + operator returns a missing value when any operand is missing, whereas the SUM function returns the sum of the non-missing arguments.
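The difference shows up as soon as one value is missing. In this small sketch the variable names are arbitrary.

```sas
data _null_;
  a = 10;
  b = .;      /* missing value */
  c = 5;

  total_fn = sum(a, b, c);   /* ignores the missing value: 15 */
  total_op = a + b + c;      /* any missing operand gives a missing result */

  put total_fn= total_op=;   /* log: total_fn=15 total_op=. */
run;
```

SAS also writes a note to the log saying that missing values were generated by the operation on missing values.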
131. What purpose does the DIVIDE function serve?
The DIVIDE function returns the result of a division and handles special missing values for ODS output.
132. How does the CALL PRXFREE Routine work?
The CALL PRXFREE routine, which belongs to the character string matching category, frees memory that was allocated for a Perl regular expression.
133. Describe how ANYDIGIT works.
The ANYDIGIT function searches a string for the first occurrence of a digit and returns its position. It returns 0 if no digit is found. An optional second argument lets the search start at any position in the string.
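A short sketch of both the basic call and the optional start position follows; the sample string is made up.

```sas
data _null_;
  text = 'Order A17 shipped in 3 days';

  first_pos = anydigit(text);       /* position of the first digit, the 1 in A17: 8 */
  next_pos  = anydigit(text, 10);   /* start searching at position 10: finds the 3 at 22 */
  none      = anydigit('no digits');   /* no digit present: 0 */

  put first_pos= next_pos= none=;
run;
```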
134. Explanation of BY-group Processing
The BY-group processing makes use of the BY statement to process data that has been indexed, grouped, or ordered according to variables.
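As a sketch of BY-group processing, the steps below count observations per group using the automatic FIRST. and LAST. variables that the BY statement creates. SASHELP.CLASS is a sample data set shipped with SAS; the output names are assumptions.

```sas
/* BY-group processing in a DATA step needs sorted or indexed input */
proc sort data=sashelp.class out=work.class_sorted;
  by sex;
run;

data work.count_by_sex;
  set work.class_sorted;
  by sex;
  if first.sex then n = 0;   /* reset the counter at the start of each group */
  n + 1;                     /* sum statement: n is retained across iterations */
  if last.sex then output;   /* write one observation per group */
  keep sex n;
run;
```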
135. What would be the value of month at the end of data step execution and how many observations would be there?
The value of month would be 13
No. of observations would be 1
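The DATA step that this question refers to is not reproduced here. One step that would behave as described, assuming a simple iterative DO loop from 1 to 12 and no input data set, is sketched below; the data set name and the variable x are hypothetical.

```sas
data work.loop_demo;
  do month = 1 to 12;
    x + 1;          /* sum statement, executed 12 times */
  end;
  /* The loop ends when month is incremented past 12, so month is 13 here. */
  /* With no input data set, the step iterates once and writes one observation. */
run;
```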
136. What does the CALL MISSING Routine mean to you?
The CALL MISSING routine assigns missing values to the specified character or numeric variables.
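A minimal sketch with one character and one numeric variable, both with made-up names:

```sas
data _null_;
  length name $ 8;
  name = 'Ann';
  age  = 30;

  call missing(name, age);   /* name becomes blank, age becomes . */

  put name= age=;
run;
```

A single call can mix character and numeric variables, which is more compact than assigning a blank and a period separately.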
137. PDV (Program Data Vector) – what is it?
The program data vector (PDV) is a logical area of memory where SAS builds a data set, one observation at a time. When a program runs, SAS reads data values from the input buffer or creates them by executing SAS language statements, and assigns these values to the appropriate variables in the program data vector. The program data vector also contains two automatic variables, _N_ and _ERROR_.
138. How does PROC COMPARE handle variable formats?
PROC COMPARE compares unformatted values, so the formats attached to the variables do not change which values are considered equal.
139. What do you mean by the option ALTER= Data Set?
It assigns an ALTER= password to a SAS file. The password prevents users from replacing or deleting the file and allows access to a read-protected or write-protected file.
140. What function does SAS’ Retain serve?
By default, SAS sets variables that are assigned values by an INPUT statement or an assignment statement to missing at the beginning of each iteration of the DATA step, before new values are read into the program data vector (a logical area of memory). The RETAIN statement overrides this default. When a variable is named in a RETAIN statement, SAS does not reset it to missing between iterations of the DATA step; its value is carried over from the previous iteration.
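A running total is a typical use. In this sketch the data set and variable names are hypothetical.

```sas
data work.running;
  input amount;
  retain running_total 0;                    /* start at 0 and keep the value across iterations */
  running_total = running_total + amount;    /* 10, then 30, then 60 */
  datalines;
10
20
30
;
run;
```

Without the RETAIN statement, running_total would be reset to missing at the start of every iteration, and adding amount to a missing value with + would give missing each time.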
141. How do dates work in SAS data?
Data is central to every data set. In SAS, data is available in tabular form where variables occupy the column space and observations occupy the row space.
- SAS treats numbers as numeric data and everything else falls under character data. Hence SAS has two data types numeric and character.
- Apart from these, dates in SAS are represented in a special way compared to other languages.
Figure:SAS Interview Questions – SAS Dates
- A SAS date is a numeric value equal to the number of days since January 1, 1960.
- Apart from Date Values, there are many tools to work on dates such as informats for reading dates, functions for manipulating dates, and formats for printing dates.
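The numeric nature of SAS dates can be seen with a few date literals. This is only a sketch; the variable names are arbitrary.

```sas
data _null_;
  base     = '01JAN1960'd;   /* the reference date, stored as 0 */
  next_day = '02JAN1960'd;   /* one day later, stored as 1 */

  d = input('15MAR2021', date9.);   /* informat: read text as a date value */
  days_since = d - base;            /* dates are numbers, so arithmetic works */

  put base= next_day=;
  put d date9.;                     /* format: print the number as a date */
run;
```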
142. List down the reasons for choosing SAS over other data analytics tools.
We will compare SAS with the popular alternatives in the market based on the following aspects:
Aspect | SAS |
Ease of Learning | SAS is easy to learn and provides an easy option (PROC SQL) for people who already know SQL. |
Data Handling Capabilities | SAS is on par with all leading tools including R & Python when it comes to handling huge amounts of data and options for parallel computations. |
Graphical Capabilities | SAS provides functional graphical capabilities and with a little bit of learning, it is possible to customize these plots. |
Advancements in Tool | SAS releases updates in a controlled environment, hence they are well-tested. R & Python, on the other hand, have an open contribution and there are chances of errors in the latest developments. |
Job Scenario | Globally, SAS is the market leader in available corporate jobs. In India, SAS controls about 70% of the data analytics market share. |
Table: SAS over other data analytics tools
143. What is interleaving in SAS?
Interleaving combines individual, sorted SAS data sets into one sorted SAS data set. You interleave data sets using a SET statement along with a BY statement.
In the following example, the data sets data1 and data2 are each already sorted by the variable Year, and the code interleaves them on Year.
data combined;
  set data1 data2;
  by Year;
run;
144. Explain the tranwrd function.
The TRANWRD function replaces or removes all occurrences of a substring in a character string.
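A small sketch of a replacement follows; the variable names and the sample string are made up.

```sas
data _null_;
  title_old = 'Mrs. Joan Smith';
  /* Arguments: source string, substring to find, replacement */
  title_new = tranwrd(title_old, 'Mrs.', 'Ms.');
  put title_new=;    /* log: title_new=Ms. Joan Smith */
run;
```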
145. State the difference between DO WHILE and DO UNTIL.
The DO WHILE expression is evaluated at the top of the DO loop, so if the expression is false on the first evaluation, the loop never executes. The DO UNTIL expression, by contrast, is evaluated at the bottom of the loop, so the loop always executes at least once.
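The difference can be sketched with two loops that start from a value that already fails the WHILE test and already satisfies the UNTIL test. The variable names are arbitrary.

```sas
data _null_;
  i = 10;
  do while (i < 5);     /* tested at the top: false immediately, body never runs */
    put 'WHILE body ran: ' i=;
    i = i + 1;
  end;

  j = 10;
  do until (j >= 5);    /* tested at the bottom: body runs once before the test */
    put 'UNTIL body ran: ' j=;
    j = j + 1;
  end;
run;
```

Only the UNTIL message appears in the log, once, with j=10.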
146. What is the CROSSLIST option in the TABLES statement?
Adding the CROSSLIST option to the TABLES statement of PROC FREQ displays crosstabulation tables in ODS column format.
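A minimal sketch with PROC FREQ and the SASHELP.CLASS sample data set; the choice of variables is arbitrary.

```sas
proc freq data=sashelp.class;
  /* CROSSLIST prints the sex-by-age crosstabulation in column (list-style) layout */
  tables sex*age / crosslist;
run;
```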
147. Explain the function of the output statement in a SAS program.
The OUTPUT statement saves summary statistics in a SAS data set, which is useful for creating customized reports or saving historical information about a process.
The OUTPUT statement can be used for the following:
- Specifying the statistics to be saved in the output data set
- Specifying the name of the output data set
- Computing and saving percentiles that are not automatically computed by the CAPABILITY procedure
148. Given an unsorted data set, how to read the last observation to a new data set?
We can read the last observation to a new data set using the END= option of the SET statement.
For example:
data work.calculus;
  set work.comp end=last;
  if last;
run;
Here calculus is the new data set to be created and comp is the existing data set. last is a temporary variable (initialized to 0) that is set to 1 when the SET statement reads the last observation.