Data Warehouse Final

Description

Final
Kyle Genetin
Flashcards by Kyle Genetin, updated more than 1 year ago
Kyle Genetin
Created by Kyle Genetin over 2 years ago
9
0

Resource summary

Question Answer
Inmon advocated building the data warehouse piece by piece as needed True or False False
We need to "reform" dimensions so they can be used by >1 Fact table True or False False
A project plan for an enterprise data warehouse may look like this? False
What would be a typical ordering of steps for a dimensional approach to DW development? Data mart, business requirements, conformance Matrix Business requirements, conformance Matrix, Data mart Business requirements, enterprise data warehouse, data mart Business requirements, Data mart, conformance Matrix Business requirements, conformance Matrix, Data mart
The artifact shown here is a conformance matrix? True or False True
What is the flaw in this artifact X needed for Professor-Tuition X not need for Professor-Graduation Columns for other dimensions are missing There are no flaws Columns for other dimensions are missing
These two dimensions are not conformed True or False True
Inconsistencies like these could be reconciled automatically True or False? False
The solution to keeping our dimensions conformed is to use Master Data True or False True
In a conformed rollup, one dimension is more granular than the other True
In this diagram, Dim1 is the conformed rollup, and Dim2 is the base dimension True or False False
Stakeholders should sign off before developing a data mart without first planning conformed DIMs True or False True
This DIM can hold events like Justin Turner being sacked by the browns True
Which is true about how the change in flour was handled in this ProductDIM example? Type 3 SCD was Used Type 2 SCD was Used Type 1 SCD was Used None of the Above Type 1 SCD was Used
It is up to IT Professionals to decide whether SCD 1, 2, or 3 should be used True or False? False
Deciding how we want to keep track of history in a data warehouse called "Retrospection" True
Which is true about how the change in flour was handled in this ProductDIM example? Type 3 SCD was Used Type 2 SCD was Used Type 1 SCD was Used Not correct b/c violates 1st normal form Type 2 SCD was Used
Type 3 SCD is useful for viewing old data through a new lens or vice versa True or False True
If we do not have a natural key, we cannot implement SCD Type 2 True or False False
This example illustrates that SCD 1 generates a lot of new records SCD 2 generates a lot of new records SCD 3 generates a lot of new records Hal Smith should not have been married SCD 2 generates a lot of new records
There is just one instantiation of the entity in this example True or False True
If we implement a Type 2 SCD for Manager, what will happen? We will overwrite CustomerAddress We will change the ManagerKEY We will add a column for prior manager We will issue numerous new records for our customers We will issue numerous new records for our customers
At any given point in time, some records in a data warehouse will be "out of date" True or False True
Which is true about how the change in flour was handled in this ProductDIM example? Type 3 SCD was Used Type 2 SCD was Used Type 1 SCD was Used Not correct b/c violates 1st normal form Type 3 SCD was Used
How can we, as IT people, better ensure high quality data? All the other choices Set up communications between data creators and data users Persuade management to spend $ to fix data at source Give responsibility for data cleanliness to business owners All the other choices
Manual entry of website data remains the biggest threat to data quality True or False False
Which ETL principle ensures that the same types of records are always treated the same way? 1st Normal Form Reliability Availability Manageability Reliability
Data cubes have a lot of data redundancy to save space True or False False
Which ETL principle ensures that the Data Warehouse can evolve gracefully Type 2 SCD Reliability Availability Manageability Manageability
The function RandInt(n) would be useful for generating the necessary surrogate keys True or False False
When loading a star, we should load the Fact Table first? True or False False
Which ETL principle ensures that Service Level Agreement are met? 3rd Normal Form Reliability Availability Manageability Availability
Inspection of log files is one way to determine when SCD 2 change need to be made True or False True
What is the grain of this FACT table? Date Date and Product Store and Product Date, Product, and Store Date, Product, and Store
In which of our problem sets have we created Transaction Fact Tables? PS2 & 3 - Shoplifting Incidents PS7 - Sales and Targets PS$ - Emergency Room Check In All the other choices All the other choices
What type of FACT table is this? Event FACTLESS Transaction FACT Coverage FACTLESS Periodic Snapshot FACT Event FACTLESS
This is a Periodic Snapshot Table True or False True
An accumulating snapshot FACT table tracks what has happened over a period of time True or False True
Select AS Many of the following that are correct Accumulating snapshot tables use "FAT" data Periodic snapshot tables use "Thin" data Transaction FACT tables use "THIN" data Accumulating snapshot tables use "THIN" data Accumulating snapshot tables use "FAT" data Periodic snapshot tables use "Thin" data Transaction FACT tables use "THIN" data
Accumulating Snapshot FACT tables typically are snowflaked True or False False
Accumulating Snapshot Fact Tables typically use Dimensional Role Play True or False True
A periodic snapshot FACT table captures One row per transaction One row for the entire lifetime of an event One row per time period None of the other choices One row per time period
A Prof. that needs to understand student video watching at various points in the semester Should use a Periodic Snapshot FACT table Should use a transaction FACT table Should use an Accumulating Snapshot FACT table Should get out of the teaching business Should use a Periodic Snapshot FACT table
Which is true about how the change in flour was handled in this example? Type 3 SCD was used Type 2 SCD was used Type 1 SCD was used There cannot be columns for Old and New without violating 3rd normal form Type 3 SCD was used
A harm that may come from the way firms process PII is that customers may be "outed" as something they either are not, or do not want revealed True or False True
A Periodic Snapshot FACT Table stores the state of a business at a regular time interval True or False True
Accumulating snapshot tables use "FAT" data True or False True
Without a natural (business, persistent) key in a Dimension, we cannot implement SCD Type 2 True or False True
There is just one instantiation of the entity in this example True or False True
Data Profiling would be a way to find out that regions did not conform in Problem Set 6 True or False True
Stakeholders should make sure that conforming DIMs have been planned before they sign off on (approve) a data mart project True or False True
These two dimensions are not conformed True or False True
You may recall the star schema we used in Problem Set 7. What kind of FACT table is at the center of this STAR? Periodic Snapshot FACT Table Conformed Snapshot FACT Table Accumulating Snapshot FACT Table Transaction FACT Table Transaction FACT Table
Assume that there can be more than one promotion active for a product on any given day at any given store. One possible way of designing the FACT table would be as follows: Is this Correct? Yes, because we got rid of the KeyNo as the single PK No, because PromoDIM should be a causal DIM No, because it should use dimensional role playing for Promotions Yes, because it ensures the right granularity Yes, because it ensures the right granularity
Second Portion Question 1 Over the past few years the owners of this jewelry business received communications from some of their customers who have told them that their product selection does not cater well to people who identify in a non-traditional manner (i.e. non-binary, trans, etc.). Having little personal experience with this, the owners want to make sure they can study the impact it may have on sales, especially for people who first identify one way, but then end up identifying another. What method would you advise the owners to use to capture the needed information? SCD 3 on the CustomerDIM SCD 2 on the CustomerDIM SCD 1 on the CustomerDIM Dimensional Role Play with the Customer DIM SCD 2 on the CustomerDIM
Second Portion Question 2 This is the DIM they came up with It fully implements the solution you chose in the prior question True or False False It should be implementing an SCD 2 solution, which also requires a Natural Key. It seems to have a start and end date for the entity, but no time stamp and current flag for an entity such as Gender Identification.
Second Portion Question 3 Assume that the owners do need an SCD 2 solution (I am not saying this is correct). If that is the case, what do they need to change? a) Add a Natural Key b) Add a Surrogate Key c) Add a Date Stamp d) Add a Current Flag e) A, C and D f) B, C, and D e) A, C and D
Second Portion Question 4 Assume that the owners have implemented a Type 2 SCD solution for the CustDIM. What problems might they encounter? They will have to issue new surrogate keys every time a customer buys a different product Multiple versions of the same Cust name could cause him to produce a Cartesian Product when joining the Dimension and the Fact table None of the choices here will be a problem They will have to use SELECT COUNT (DISTINCT column-name) when using SQL to analyze how many Customers they have They will have to use SELECT COUNT (DISTINCT column-name) when using SQL to analyze how many Customers they have
Second Portion Question 5 Another reason the owners want to keep track of this information is because of the possibility that lawsuits might be brought regarding certain products that they sell. Being an enterprise that taps into a wide variety of artists, they sell collections like the "Bad Boy Collection" that might provoke sufficient offense as to lead to a case against them. Or there may be defects with earrings or other pieces that could cause physical injury. They are headquartered in Tennessee, and are aware that the Tennessee courts adopted a star schema system based on Microsoft SQL Server 2005. This is the relevant star they can connect to to find out if any cases have been filled regarding the products they carry: What kind of FACT table is present in this star? Accumulating Snapshot FACT Table Product Bridge FACT Table Periodic Snapshot FACT Table Event Factless Fact Table Event Factless Fact Table
Second Portion Question 6 Compare the Product dimensions in each of the two stars (the original Sales star and the new courts star). Pick as many of the following statements as you think are correct: A. The two dimensions are not conformed because the fields are not identical. B. The two dimensions are not conformed because the presence of a Bridge table makes the relationship between the two stars uncertain. C. The two dimensions are conformed using a conformed roll-up. D. The two dimensions are not conformed because the granularity in the Disposition FACT table is not correct. E. The two dimensions are conformed because they each have a Product Category. F. The two dimensions are not conformed because the contents of the fields are not identical. G. The two dimensions are conformed because they are identical. A. The two dimensions are not conformed because the fields are not identical. F. The two dimensions are not conformed because the contents of the fields are not identical.
Second Portion Question 7 Assume there is a conformance problem with the ProductDIM. TF: This vest way to address this problem is by using a Conformance Matrix True False False This question relies on remembering that the Jewelry Store owners have almost certainly have no influence over how the Tennessee Court system is storing its data. Fortunately, Tennessee is using the Universal Product Code, which is a Natural Key fvor the Products. As long as the Jewelry Store can characterize its products with the UPC, it can bring about conformance. A Conformance Matrix alerts designers and developers to which Dimensions will be used in different business processes, but that going to have to be within the same firm or partner firms.
Second Portion Question 8 The owners of the Jewelry business now want to check to see which, if any, of their products may have been involved in Tennessee court cases. Assuming that any conformance issues with the ProductDIM have been resolved (if there are any), on what dimension(s) should the owners drill across? ProductDIM-SalesFACT Star to ProductDIM DispositionFACT Star Both ProductDIM-SalesFACT Star to ProductDIM DispositionFACT Star and DateDIM-SalesFACT Star to DateDIM DispositionFACT Star The bridge table in the DispositionFACT Star makes it impossible to drill across Sales FACT to DispositionFACT DateDIM-SalesFACT Star to DateDIM DispositionFACT Star ProductDIM-SalesFACT Star to ProductDIM DispositionFACT Star ProductDIM-Sales Facet Star to ProductDIM DispositionFACT Star. WSe do not know when a court case may have been ruled on, i.e. what is the DispositionDATE of the case. Although we have a DateDIM in that star, its meaning is very different. For the SalesFACT star, the date is the date of a sale. So if we were to match date of sale to date of disposition, we would almost certainly get nothing, and thereby miss cases. The only match that makes sense is on the products themselves.
Second Portion Question 9 The Tennessee Court system has noticed that some of the records in the Star Schema for disposition of cases are missing data for some fields. This is hindering its ability to do important analysis. What would be the best single thing to do (if you can only do one of these) to ensure the data is complete and accurate? Make sure that the business side has responsibility for data quality Hire extra court stenographers to take down all the information Set up IT systems that will reject any data that does not fit the Master Data files Make the lawyers on each side responsible for entering their own information through a portal Make sure that the business side has responsibility for data quality Making the lawyers responsible is no guarantee they will do it well. And how do we know that's all that is needed? Hiring extra stenographers would be expensive, and again, they are not necessarily seeing the whole process (small partial credit). We do like the idea of IT systems that reject data that cannot have referential integrity with Master Data, so that's partial credit. But the problem seems to be missing data, which may come because it does not match, but may just be because no one is making sure it is being entered. So the only choice here that is comprehensive is to make the business side, i.e. the lawyers and courts, responsible for ensuring data quality. They the next stage of necessary measures can be defined and they can be held accountable.
Show full summary Hide full summary

Similar

Microbiology MCQs 3rd Year Final- PMU
Med Student
Anatomy and Physiology
pressey_property
Science Final Study Guide
Caroline Conlan
APUSH End-of-Year Cram Exam: Set 1
Nathaniel Rodriguez
sec + final
maxwell3254
CCNA Security Final Exam
Maikel Degrande
CCNP TShoot Final
palciny
Macroeconomics Final
Koda M
Architecture
lindsey.reynolds
audio electronics
Lillian Mehler
Biochemistry Final Review
Kaitlyn Emily Bi