What is a possible pitfall of utilizing Excel as a way to manipulate small databases?

Big Data Modeling and Management Systems is offered by UC San Diego through the Coursera platform and is Course 2 of 6 in the Big Data Specialization.

About this Course: This course is for those new to data science. Completion of Intro. to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications.

Big Data Modeling and Management Systems Week 1 Quiz Answers!

Q1. (Questions 1-3 pertain to the video lecture “Exploring the Relational Data Model of CSV”) What is the approximate population of La Paz County in the state of Arizona for the CENSUS2010POP (column H)? (Choose the best answer.)

  • 15000
  • 25000
  • 10000
  • 20000

Q2. What county in the state of Wyoming has the smallest estimated population?

  • Platte
  • Uinta
  • Niobrara
  • Sweetwater

Q3. At 2:45 of the video, the Instructor creates a filter for all of the counties in California with a population greater than 1,000,000. However, included in the results is the entire state of California. This anomalous value might skew our analysis if, for example, we wanted to compute the average population of these results. What additional filter might work to resolve this problem?

  • Add a filter to detect and remove results which do not include the word “County” in column G.
  • Add a filter which finds all counties with population greater than 100,000 AND less than 10,000,000 for column H (CENSUS2010POP).
  • Add a filter where the value in column E is greater than 1,000,000.
  • None of the above

Q4. (Questions 4 and 5 pertain to the video “Exploring Sensor Data”) How often (in seconds) do the R5 measurements occur?

  • 60
  • 40
  • 50
  • 30

Q5. What is the field for rain accumulation?

  • Sm
  • Dn
  • Rc
  • Dx

Q6. (Questions 6 and 7 pertain to the video lecture “Exploring the Array Data Model of an Image”) What is the (Red, Green, Blue) pixel value for location 500, 2000?

  • (163, 118, 79)
  • (134, 145, 46)
  • (50, 156, 182)
  • (100, 123, 149)

Q7. Is this value likely to be land or ocean?

  • Land
  • Ocean

Q8. (Questions 8 and 9 pertain to the video lecture “Exploring the Semistructured Data Model of JSON”) Given a tweet, what path would you most likely enter to obtain a count of the number of followers for a user?

  • user/followers_count
  • user/statuses_count
  • user/listed_count
  • None of the above
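
The slash-separated paths in Q8 describe a walk through nested JSON objects, one key per step. A minimal Python sketch of that idea follows; the `get_by_path` helper and the trimmed `tweet` dictionary are hypothetical and the numbers are invented for illustration, not taken from the course data.

```python
from typing import Any


def get_by_path(doc: dict, path: str, sep: str = "/") -> Any:
    """Walk a nested dict one key at a time, following a separator-delimited path."""
    node: Any = doc
    for key in path.split(sep):
        node = node[key]  # raises KeyError if the path does not exist
    return node


# Hypothetical, heavily trimmed tweet; values are made up for illustration.
tweet = {
    "text": "hello",
    "user": {
        "screen_name": "example",
        "followers_count": 42,
        "statuses_count": 7,
        "listed_count": 1,
    },
}

followers = get_by_path(tweet, "user/followers_count")
```

Each path segment moves from a parent object to one of its children, which is why a tree is a natural way to think about this kind of data.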

Q9. Which of the following fields are nested within the ‘entities’ field (select all that apply)?

  • tweets
  • user_mentions
  • events
  • views
  • symbols
  • urls

Big Data Modeling and Management Systems Week 2 Quiz Answers — Data Models Quiz Answers!

Q1. What is a possible pitfall of utilizing Excel as a way to manipulate small databases?

  • Excel does not enforce many principles of relational data models.
  • Excel is a user program and thus cannot run on a server.
  • Excel does not allow algorithms for data manipulation.

Q2. What does the term “atomic” mean in the context of relational databases?

  • Fixed schema of a particular database.
  • A tuple that cannot be reduced.
  • A column or row of data. Depends on the context.
  • One unit of information that cannot be decomposed.

Q3. What is the Pareto-Optimality problem?

  • Find the shortest path from source node to target node.
  • Find the best possible path given two or more optimization criteria where neither constraint can be fully optimized simultaneously.
  • Find the optimal path that requires going through specific nodes given by the user.
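
As a rough illustration of optimizing over two competing criteria, the Python sketch below keeps only the candidate paths that no other path beats on both distance and toll cost at once. The `Path` type, the two criteria, and the sample routes are assumptions made up for this example, not part of the quiz.

```python
from typing import NamedTuple


class Path(NamedTuple):
    name: str
    distance_km: float
    toll_cost: float


def dominates(a: Path, b: Path) -> bool:
    """a dominates b if it is no worse on both criteria and strictly better on one."""
    no_worse = a.distance_km <= b.distance_km and a.toll_cost <= b.toll_cost
    strictly_better = a.distance_km < b.distance_km or a.toll_cost < b.toll_cost
    return no_worse and strictly_better


def pareto_front(paths: list[Path]) -> list[Path]:
    """Keep every path that is not dominated by any other path."""
    return [p for p in paths if not any(dominates(q, p) for q in paths)]


# Hypothetical candidate routes between the same source and target.
candidates = [
    Path("highway", 10.0, 5.0),
    Path("back roads", 14.0, 0.0),
    Path("scenic", 15.0, 6.0),
    Path("mixed", 12.0, 2.0),
]

front = pareto_front(candidates)
```

Here "scenic" drops out because "highway" is both shorter and cheaper, while the remaining three routes each trade distance against cost, so none of them is best on both criteria.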

Q4. What constitutes a community within a graph?

  • High density of nodes at a certain location.
  • A neighborhood defined by an integer constant K around a specific node. All K+1 nodes belong in another community.
  • A dense amount of edge connections between nodes in a community and a few connections across communities.
  • Many anomalous neighborhoods within the same vicinity.

Q5. Why are trees useful for semi-structured data such as XML and JSON?

  • Computers can easily visualize the data with a tree structure.
  • It is not always the case that XML and JSON can be represented as trees.
  • Trees take advantage of the parent-child relationship of the data for easy navigation.
  • They are only useful for XML data as tree-like structure is apparent with tags. While JSON does not contain a tree-like structure as it contains arrays.

Q6. What is the general purpose of modeling data as vectors?

  • Enables weighting of the query.
  • The ability to normalize vectors allowing probability distributions.
  • Enables image searching.
  • Results can be ordered by similarity using vector projection.
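
To make the idea of ordering results by similarity concrete, here is a small Python sketch that ranks documents against a query using cosine similarity. Cosine similarity is one common choice and is assumed here for illustration; the three-dimensional vectors and document names are invented.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical term-weight vectors for three documents and one query.
documents = {
    "doc_a": [1.0, 0.0, 1.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [1.0, 1.0, 1.0],
}
query = [1.0, 0.0, 1.0]

# Most similar document first.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
```

Changing the numbers in `query` acts as a weighting of the query terms, which in turn changes the ranking.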

Q7. For the following questions 7, 8, and 9, suppose a registration website creates data with the following fields for each person registered (note: if the user does not input a value, NULL is stored instead): Name, Date, Address, and Account Number.

Suppose we collect data month by month. Each month, we would have a batch of data containing the fields listed above. At the end of the year, we want to summarize our registrant activities for the entire year, so we would remove redundancies in our data by removing any records with duplicate account numbers from month to month. What type of operation do we use in this scenario?

  • Join
  • Not an Operation
  • Subsetting
  • Union

Q8. From the information given in question 7, what are the constraints, if any, which we have placed on the Account Number field for the end of year collection?

  • Account should have at most n digits.
  • If we had n duplicate Account Numbers then we will remove n-1 duplicate fields.
  • There are no constraints.
  • Account Number should be unique.
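
The scenario in questions 7 and 8 can be sketched in Python as combining monthly batches while keeping one record per account number. The field names, the sample records, and the choice to keep the first record seen are all assumptions for illustration; the quiz does not specify which duplicate survives.

```python
def combine_batches(batches: list[list[dict]]) -> list[dict]:
    """Combine monthly batches, keeping the first record seen per account number."""
    merged: dict[int, dict] = {}
    for batch in batches:
        for record in batch:
            # setdefault leaves an existing entry untouched, so later duplicates are dropped.
            merged.setdefault(record["account_number"], record)
    return list(merged.values())


# Hypothetical monthly batches; None stands in for NULL.
january = [
    {"name": "Ana", "date": "2023-01-05", "address": None, "account_number": 1001},
    {"name": "Ben", "date": "2023-01-09", "address": "3 Oak Ave", "account_number": 1002},
]
february = [
    {"name": "Ana", "date": "2023-02-11", "address": "12 Elm St", "account_number": 1001},
    {"name": "Cy", "date": "2023-02-20", "address": None, "account_number": 1003},
]

yearly = combine_batches([january, february])
```

After combining, every account number appears exactly once in `yearly`, and records with a missing address still carry all four fields.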

Q9. Suppose 100 people sign up for our system and of the 100 people, 60 of them did not input an address. The system lists the values as NULL for these empty entries in the address field. Would this situation still have structure for our data?

  • No, because the majority of data do not have a specific field filled, thus our originally defined structure is lost.
  • Yes, the data has structure because we have placed a structural constraint on the data, thus the data will always have the originally defined structure.

Big Data Modeling and Management Systems Week 3 Quiz Answers: Data Formats and Streaming Data Quiz Answers!

Q1. What is true between data modeling and the formatting of the data?

  • There is a one to one correspondence between formatting data and data modeling. For every model of data, there is only one way to store the data.
  • There is always one specific schema for storing model data that is the best and preferred method for the specific data representation.
  • The data does not necessarily need to be formatted in a way that represents the data model. Just so long as it can be extrapolated.

Q2. What is streaming?

  • Calculating results using real time data otherwise known as streaming data.
  • Using static data stored from a real time source in order to process and guide the application.
  • Utilizing real time data to compute and change the state of an application continuously.
  • Using sensors to manipulate the system, such as a smart car being able to drive by itself using sensors to detect road hazards.

Q3. Of the following, what best describes the properties of working with streaming data?

  • Small time windows for working with data.
  • Data is always utilized for streaming the application.
  • Data manipulation is near real time.
  • Independent computations that do not rely on previous or future data.
  • Always unbounded in sequence, in other words, data is not guaranteed to be in order.
  • Does not ping the source interactively for a response upon receiving the data.

Q4. What is a characteristic of streaming data?

  • Data is unbounded in size but requires only finite time and space to process it.
  • The data is unbounded in size and the size determines the time and space of processing the data.
  • The data is finite and requires only finite time and space to process the data.
  • Data is finite in size and size determines the time and space of processing the data.
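
One way to picture processing an unbounded stream with bounded resources is a running statistic that stores only a count and a current value, no matter how many readings arrive. The Python sketch below assumes a hypothetical `sensor_stream` generator with random readings; it is an illustration, not the course's implementation.

```python
import itertools
import random
from typing import Iterator


def sensor_stream(seed: int = 0) -> Iterator[float]:
    """Hypothetical endless stream of sensor readings between 0 and 100."""
    rng = random.Random(seed)
    while True:
        yield rng.uniform(0.0, 100.0)


class RunningMean:
    """Incremental mean that keeps two numbers regardless of stream length."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> float:
        self.count += 1
        # Shift the mean toward the new value by 1/count of the difference.
        self.mean += (value - self.mean) / self.count
        return self.mean


stats = RunningMean()
# The stream never ends, so only a slice is consumed here for demonstration.
for reading in itertools.islice(sensor_stream(), 1000):
    stats.update(reading)
```

Each update does a constant amount of work and touches no earlier readings, so memory use stays flat as the stream grows.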

Q5. What type of algorithm is required for analyzing streaming data?

  • Accurate and Consistent
  • Accurate and Memory Efficient
  • Fast and Complex
  • Fast and Simple

Q6. What is lambda architecture?

  • A specific method for processing streaming data using special real time processes.
  • A specific hardware architecture for a server made specifically for processing real time data.
  • A method to process streaming data by utilizing batch processing and real time processing.
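
A toy Python sketch of combining batch processing with real-time processing follows. It counts events using a periodically recomputed batch view plus an incrementally updated real-time view and merges the two at query time. The `LambdaCounter` class and its method names are hypothetical, and a production system would use separate distributed components rather than one in-memory object.

```python
from collections import Counter


class LambdaCounter:
    """Toy event counter with a batch view and a real-time (speed) view."""

    def __init__(self) -> None:
        self.master_log: list[str] = []        # append-only record of every event
        self.batch_view: Counter = Counter()   # recomputed from the full log
        self.speed_view: Counter = Counter()   # events seen since the last batch run

    def ingest(self, event: str) -> None:
        """Store the event and update the real-time view immediately."""
        self.master_log.append(event)
        self.speed_view[event] += 1

    def run_batch(self) -> None:
        """Recompute the batch view from all stored data, then reset the speed view."""
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, event: str) -> int:
        """Answer by merging the batch view with the real-time view."""
        return self.batch_view[event] + self.speed_view[event]


counter = LambdaCounter()
for e in ["click", "click", "view"]:
    counter.ingest(e)
counter.run_batch()
counter.ingest("click")  # arrives after the batch run, so only the speed view has it
clicks = counter.query("click")
```

The query stays current between batch runs because the real-time view covers whatever the last batch computation has not yet seen.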

Q7. Of the following, which best represents the challenge regarding the size and frequency of data?

  • The size and frequency of the streaming data may be too small.
  • The size and frequency of the streaming data may be sporadic.
  • There may not be data to produce the notion of size and frequency.

Q8. What is the difference between data lakes and data warehouses?

  • Data lakes house raw data while data warehouses contain pre-formatted data.
  • Data lakes contain only files while data warehouses contain only databases.
  • Data lakes utilize hierarchical systems while data warehouses use object storage.

Q9. What is schema-on-read?

  • The process where formatted data is given structure when read.
  • Another name for data lakes.
  • Data is stored as raw data until it is read by an application where the application assigns structure.
  • The process where data is pre-formatted prior to being read but the schema is loaded on read.
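
As a small illustration of assigning structure at read time, the Python sketch below keeps records as raw JSON strings and lets the reading application pick fields, cast types, and fill in missing values. The `raw_store` contents and the `read_accounts` function are made-up examples.

```python
import json
from typing import Iterator

# Raw records stored as-is; nothing validated their shape when they were written.
raw_store = [
    '{"name": "Ana", "account_number": "1001", "address": null}',
    '{"name": "Ben", "account_number": "1002", "signup_source": "mobile"}',
]


def read_accounts(lines: list[str]) -> Iterator[dict]:
    """Apply a structure while reading: select fields, cast types, default missing ones."""
    for line in lines:
        record = json.loads(line)
        yield {
            "name": str(record["name"]),
            "account_number": int(record["account_number"]),
            "address": record.get("address"),  # None when absent or null
        }


accounts = list(read_accounts(raw_store))
```

A different application could read the same raw strings with a different structure, for example one that keeps `signup_source`.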

Big Data Modeling and Management Systems Week 4 Quiz Answers: BDMS Quiz Answers!

Q1. The desired characteristics of a BDMS include (select all that apply):

  • Narrow range of query sizes
  • Continuous data ingestion
  • Support for common “Big Data” data types
  • Support for ACID
  • A full query language
  • A flexible semi-structured data model

Q2. Fill in the blank with the best answer: CAP theorem states that _________ all at once within a distributed computer system.

  • it is impossible to have consistency, accuracy, and partial tolerance
  • it is necessary to have consistency, accuracy, and partial tolerance
  • it is necessary to have consistency, availability, and partition tolerance
  • it is impossible to have consistency, availability, and partition tolerance

Q3. What is the purpose of the acronym BASE?

  • The same as ACID.
  • To overcome CAP theorem.
  • To impose properties on a BDMS in order to guarantee certain results.
  • Enables stricter enforcement of ACID type design.

Q4. What are ziplists in Redis?

  • A special type of data type that can store up to 512 MB of image data.
  • A look up table that is stored as a value in the database. Look up table points to actual values in memory.
  • A compressed list that is stored within the value of the database.
  • A special type of data type that can store hashes that point to multiple attributes.

Q5. What is one of the main features of Aerospike?

  • Images as values within the database.
  • Enables real time data streaming from external sources.
  • Support for geospatial data storage and geospatial queries.
  • Better equipped for string based search applications.

Q6. What database would be best suited for the following scenario: An app development company is trying to implement a cloud based storage system for their new map-based app. The cloud will manage the longitude and latitude of the data in order to track user location.

  • Solr
  • Vertica
  • Aerospike
  • Redis

Q7. What database would be best suited for the following scenario: A big wholesale company is trying to implement a search engine for their products.

  • Redis
  • Aerospike
  • Solr
  • Vertica

Q8. Which of the following data types are supported by Redis? (select all that apply)

  • Sorted Sets
  • Images
  • Hashes
  • Lists
  • Streaming Video
  • Strings

Example: What is CanCollide in Roblox?

-- CanCollide is a property that determines whether a Part physically collides with other objects

-- Solid: other objects rest on the part instead of passing through it
script.Parent.CanCollide = true

-- Non-solid: other objects pass (fall) through the part
script.Parent.CanCollide = false

Introduction to Big Data

Test 1: Why big data and where did it come from?

1. Which of the following is an example of big data utilized in action today?

a. The internet

b. Individual, unconnected hospital databases

c. Social media

d. Wifi networks

2. What reasoning was given for the following: Why is the data storage to price ratio relevant to big data?

a. Companies can't afford to own, maintain and spend the energy to support large data storage unless the cost is sufficiently low.

b. It isn't, it was just an arbitrary example of big data usage.

c. Larger storage means easier accessibility to big data for every user because it allows users to download in bulk.

d. Lower prices mean larger storage becomes easier to access for everyone, creating bigger amounts of data for client facing services to work with.

3. What is the best description of personalized marketing enabled by big data?

a. Being able to use personalized data from every single customer for personalized marketing needs.

b. Marketing to each customer on an individual level and suiting to their needs.

c. Being able to obtain and use customer information for groups of consumers and utilize them for marketing needs.

4. Of the following, which are some examples of personalized marketing related to big data?

a. Facebook revealing posts that cater towards similar interests.

b. A survey that asks your age and markets to you a specific brand.

c. News outlets gathering information from the internet in order to report them to the public.

5. What is the workflow for working with big data?

a. Big data > Better models > Higher precision

b. Theory > Models > Precise advice

c. Extrapolation > Understanding > Reproducing

6. Which is the most compelling reason why mobile advertising is related to big data?

a. Since most everyone owns a cell/mobile phone, the mobile advertising market is large and thus requires big data to contain all the information.

b. Mobile advertising benefits from data integration with location which requires big data.

c. Mobile advertising in and of itself is always associated with big data.

d. Mobile advertising allows massive cellular/mobile texting to a wide audience, thus providing large amounts of data.

7. What are the three types of diverse data sources?

a. Machine data, Map data, and Social Media

b. Information Networks, Map Data, and People

c. Machine Data, Organizational Data, and People

d. Sensor Data, Organizational Data, and Social Media

8. What is an example of machine data?

a. Sorted data from Amazon regarding customer info.

b. Social media

c. Weather station sensor output.

9. What is an example of organizational data?

a. Social media

b. Disease data from Center for Disease Control

c. Satellite data

10. Of the three data sources, which is the hardest to implement and streamline into a model?

Enterprise Headless CMS and Knowledge Management

Content Infrastructure for the Connected World

CONTENT IS DATA

TerminusCMS is an open-source headless content and knowledge management system. A dev-first enterprise knowledge graph to break down departmental knowledge silos. Incorporate content with operational and transactional data to discover and use organization-wide knowledge for your front-of-house and back-office front ends. 

An organization-wide knowledge graph with the analytical power to unlock enterprise potential

Build a semantically connected content and knowledge model to curate cross-divisional data, content, and documentation. Data is stored as machine-readable JSON documents which are exposed as GraphQL and Datalog APIs for schema, query, and updates. 

Demo

Admin UI

Model Schema

GraphQL API

Change Requests

KNOWLEDGE & CONTENT MANAGEMENT

Back Office, Apps, Portals, Websites, & Analytics

TerminusCMS is an enterprise knowledge graph to make content, knowledge, and data discoverable and usable.

Greater Query Power

Graph queries leveraging semantic relationships and analytics engine powered by GraphQL & Datalog.

Schema as Code

Flexible and extendable JSON schema syntax to model semantically enriched content models with code.

Provenance & Version Control

Immutable data provides Git-like features such as branch, rebase, clone, rollback, and time-travel. 

Change Request Workflows

Change request workflows built into the data layer to provide approval processes and security.

Interoperable Standards

Using JSON & RDF standards ensures interoperability across applications and devices.

100% Open Source

Choose a package that works for you. Self-host with our open-source install, or choose a hosted version, including dedicated compute resources.

USE CASES

TerminusDB and TerminusX dashboard

Document properties, ID, relationships, key strategy, and JSON view

Create teams & data products to work collaboratively.

Visual schema builder & validator

Database admin & query playground tools

TERMINUSDB — THE DATABASE

An in-memory, distributed, and open-source document graph database for people who want the convenience of documents with the query power of graph relationships. For people who want data to be the star of their builds.

So much more than CMS

Get started in minutes and for free with our TerminusCMS Community Package. Clone an example from the dashboard to experiment and play today. 
