Exam Code: 70-475 (Practice Exam Latest Test Questions VCE PDF)
Exam Name: Designing and Implementing Big Data Analytics Solutions
Certification Provider: Microsoft
Free Today! Guaranteed training to pass the 70-475 exam.
Q11. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
Your company has multiple databases that contain millions of sales transactions. You plan to implement a data mining solution to identify purchasing fraud.
You need to design a solution that mines 10 terabytes (TB) of sales data. The solution must meet the following requirements:
• Run the analysis to identify fraud once per week.
• Continue to receive new sales transactions while the analysis runs.
• Be able to stop computing services when the analysis is NOT running.
Solution: You create a Cloudera Hadoop cluster on Microsoft Azure virtual machines. Does this meet the goal?
A. Yes
B. No
Answer: A
Q12. HOTSPOT
Your company has 2000 servers.
You plan to aggregate all of the log files from the servers in a central repository that uses Microsoft Azure HDInsight. Each log file contains approximately one million records. All of the files use the .log file name extension.
The following is a sample of the entries in the log files.
2021-02-03 20:26:41 SampleClass3 (ERROR) verbose detail for id 1527353937
In Apache Hive, you need to create a data definition and a query capturing the number of records that have an error level of [ERROR].
What should you do? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
Answer:
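The hotspot answer image is not reproduced here, but the pattern this question tests can be sketched in HiveQL: an external table defined over the space-delimited .log files, followed by a count of the error rows. All names, the column split, and the storage location below are assumptions for illustration, not the official answer.

```sql
-- Hypothetical sketch: external table over the raw .log files
-- (column names, delimiter, and location are assumed).
CREATE EXTERNAL TABLE logs (
  t1 STRING, t2 STRING, t3 STRING, t4 STRING,
  t5 STRING, t6 STRING, t7 STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
STORED AS TEXTFILE
LOCATION 'wasb:///example/data/';

-- Count the records whose error-level column matches [ERROR].
SELECT COUNT(*) FROM logs WHERE t4 = '[ERROR]';
```

An EXTERNAL table is typically used here so that dropping the table does not delete the underlying log files in storage.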
Q15. DRAG DROP
You need to implement rls_table1.
Which code should you execute? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Answer:
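The rls_table1 scenario is not included in this excerpt, but the name suggests row-level security. As a hedged illustration only (every object name below is an assumption), T-SQL row-level security pairs an inline predicate function with a security policy:

```sql
-- Hypothetical sketch of row-level security; the actual rls_table1
-- requirements are not shown, so all names here are assumed.
CREATE FUNCTION dbo.fn_securitypredicate(@SalesRep AS sysname)
    RETURNS TABLE
WITH SCHEMABINDING
AS
    RETURN SELECT 1 AS fn_securitypredicate_result
    WHERE @SalesRep = USER_NAME();
GO

CREATE SECURITY POLICY dbo.SalesFilter
    ADD FILTER PREDICATE dbo.fn_securitypredicate(SalesRep)
    ON dbo.rls_table1
WITH (STATE = ON);
```

With the policy in the ON state, each user querying rls_table1 sees only the rows the predicate function returns 1 for.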
Topic 2, Mixed Questions
76. You have structured data that resides in Microsoft Azure Blob Storage.
You need to perform a rapid interactive analysis of the data and to generate visualizations of the data.
What is the best type of Azure HDInsight cluster to use to achieve the goal? More than one answer choice may achieve the goal. Select the BEST answer.
A. Apache Storm
B. Apache HBase
C. Apache Hadoop
D. Apache Spark
Answer: C
Q16. DRAG DROP
You have data generated by sensors. The data is sent to Microsoft Azure Event Hubs.
You need to have an aggregated view of the data in near real-time by using five-minute tumbling windows to identify short-term trends. You must also have hourly and daily aggregated views of the data.
Which technology should you use for each task? To answer, drag the appropriate technologies to the correct tasks. Each technology may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Answer:
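The drag-drop answer image is not reproduced here, but the near-real-time part of the requirement is the kind of workload Azure Stream Analytics windowing targets. A hedged sketch of a five-minute tumbling-window aggregate (input, output, and field names below are assumptions):

```sql
-- Hypothetical Stream Analytics query: five-minute tumbling windows
-- over an Event Hubs input (all names are assumed for illustration).
SELECT
    SensorId,
    AVG(Reading) AS AvgReading,
    System.Timestamp AS WindowEnd
INTO [aggregated-output]
FROM [eventhub-input] TIMESTAMP BY EventTime
GROUP BY SensorId, TumblingWindow(minute, 5)
```

TumblingWindow produces fixed, non-overlapping windows, so each event is counted in exactly one five-minute aggregate; the hourly and daily views would typically be computed downstream in batch.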
Q17. Your company has thousands of Internet-connected sensors.
You need to recommend a computing solution to perform a real-time analysis of the data generated by the sensors.
Which computing solution should you include in the recommendation?
A. Microsoft Azure Stream Analytics
B. Microsoft Azure Notification Hubs
C. Microsoft Azure Cognitive Services
D. a Microsoft Azure HDInsight HBase cluster
Answer: A
Q18. HOTSPOT
You implement DB2.
You need to configure the tables in DB2 to host the data from DB1. The solution must meet the requirements for DB2.
Which type of table and history table storage should you use for the table? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Q19. Which technology should you recommend to meet the technical requirement for analyzing
A. Azure Stream Analytics
B. Azure Data Lake Analytics
C. Azure Machine Learning
D. Azure HDInsight Storm clusters
Answer: A
Q20. You have a Microsoft Azure Data Factory pipeline.
You discover that the pipeline fails to execute because data is missing. You need to rerun the failure in the pipeline.
Which cmdlet should you use?
A. Set-AzureAutomationJob
B. Resume-AzureDataFactoryPipeline
C. Resume-AzureAutomationJob
D. Set-AzureDataFactorySliceStatus
Answer: B
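For context, the cmdlet named in the answer belongs to the classic (v1) Azure Data Factory PowerShell module. A hypothetical invocation might look like the following; the resource group, factory, and pipeline names are assumptions, not values from the question:

```powershell
# Hypothetical sketch: resume a suspended Data Factory (v1) pipeline.
# All names below are placeholders.
Resume-AzureDataFactoryPipeline -ResourceGroupName "rg1" `
    -DataFactoryName "df1" -Name "pipeline1"
```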