What is a possible pitfall of utilizing Excel as a way to manipulate small databases?

1. What is a possible pitfall of utilizing Excel as a way to manipulate small databases?

Excel does not enforce many principles of relational data models.
Excel is a user program and thus cannot run on a server.
Excel does not allow algorithms for data manipulation.

2. What does the term “atomic” mean in the context of relational databases?

Fixed schema of a particular database.
A tuple that cannot be reduced.
A column or row of data. Depends on the context.
One unit of information that cannot be decomposed.

3. What is the Pareto-Optimality problem?

Find the shortest path from source node to target node.
Find the best possible path given two or more optimization criteria where neither constraint can be fully optimized simultaneously.
Find the optimal path that requires going through specific nodes given by the user.
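
As a rough illustration of optimizing two criteria at once, the sketch below keeps only the candidate paths that no other path beats on both criteria. The path names, distances, and toll costs are made-up values, and `pareto_front` is a hypothetical helper, not something from the course.

```python
from typing import List, Tuple

# Hypothetical candidate paths: (name, distance_km, toll_cost). Lower is better for both.
Path = Tuple[str, float, float]


def pareto_front(paths: List[Path]) -> List[Path]:
    """Return the paths that are not dominated by any other path."""
    front = []
    for name, dist, cost in paths:
        # A path is dominated if another path is no worse on both criteria
        # and strictly better on at least one.
        dominated = any(
            (d <= dist and c <= cost) and (d < dist or c < cost)
            for _, d, c in paths
        )
        if not dominated:
            front.append((name, dist, cost))
    return front


candidates = [
    ("A", 10.0, 5.0),
    ("B", 12.0, 3.0),
    ("C", 15.0, 6.0),  # worse than A on both criteria
    ("D", 9.0, 8.0),
]
print([name for name, _, _ in pareto_front(candidates)])
```

Paths A, B, and D each win on one criterion and lose on the other, so none of them can be called the single best path; only C is dropped.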

4. What constitutes a community within a graph?

High density of nodes at a certain location.
A neighborhood defined by an integer constant K around a specific node. All K+1 nodes belong in another community.
A dense amount of edge connections between nodes in a community and a few connections across communities.
Many anomalous neighborhoods within the same vicinity.
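
To make the idea of dense connections inside a group and sparse connections across groups concrete, here is a minimal sketch that counts both kinds of edges. The graph and the community labels are invented for illustration, and the community assignment is given by hand rather than detected by an algorithm.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def edge_counts(edges: List[Edge], community: Dict[str, int]) -> Tuple[int, int]:
    """Count edges inside a community (intra) and edges between communities (inter)."""
    intra = 0
    inter = 0
    for u, v in edges:
        if community[u] == community[v]:
            intra += 1
        else:
            inter += 1
    return intra, inter


# Two triangles joined by a single bridge edge (c, d).
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
# Hand-assigned labels (an assumption for this example).
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

intra, inter = edge_counts(edges, community)
print(intra, inter)
```

With this labeling, six edges fall inside the two groups and only one edge crosses between them.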

5. Why are trees useful for semi-structured data such as XML and JSON?

Computers can easily visualize the data with a tree structure.
It is not always the case that XML and JSON can be represented as trees.
Trees take advantage of the parent-child relationship of the data for easy navigation.
They are only useful for XML data, as the tree-like structure is apparent with tags, while JSON does not contain a tree-like structure because it contains arrays.

6. What is the general purpose of modeling data as vectors?

Enables weighting of the query.
The ability to normalize vectors allowing probability distributions.
Enables image searching.
Results can be ordered by similarity using vector projection.
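
A small sketch of ordering results by similarity once items are represented as vectors. It uses cosine similarity, which is one common choice and is substituted here for illustration; the document names and term-count vectors are made up.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: List[float], docs: Dict[str, List[float]]) -> List[str]:
    """Return document names ordered from most to least similar to the query."""
    return sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)


# Hypothetical term-count vectors over a three-word vocabulary.
docs = {
    "doc1": [1.0, 0.0, 1.0],
    "doc2": [0.0, 1.0, 0.0],
    "doc3": [1.0, 1.0, 1.0],
}
print(rank([1.0, 0.0, 1.0], docs))
```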

7. For the following questions 7, 8, and 9, suppose a registration website creates data with the following fields for each person registered (note: if the user does not input a value, NULL is stored instead): Name, Date, Address, and Account Number.

Suppose we collect data month by month. Each month, we would have a batch of data containing the fields listed above. At the end of the year, we want to summarize our registrant activities for the entire year, so we would remove redundancies in our data by removing any records with duplicate account numbers from month to month. What type of operation do we use in this scenario?

Join
Not an Operation
Subsetting
Union
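
The month-by-month merge described in the question can be pictured as a set-style union keyed on the account number. The sketch below is one possible reading under that assumption; the field name `account_number`, the sample records, and the choice to keep the first record seen are all hypothetical.

```python
from typing import Dict, List

Record = Dict[str, object]


def union_by_account(batches: List[List[Record]]) -> List[Record]:
    """Combine monthly batches, keeping the first record seen for each account number."""
    seen: Dict[object, Record] = {}
    for batch in batches:
        for record in batch:
            key = record["account_number"]
            if key not in seen:
                seen[key] = record
    return list(seen.values())


january = [
    {"name": "Ann", "account_number": 1001},
    {"name": "Bob", "account_number": 1002},
]
february = [
    {"name": "Ann", "account_number": 1001},  # repeat from January
    {"name": "Cy", "account_number": 1003},
]

year = union_by_account([january, february])
print([r["account_number"] for r in year])
```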

8. From the information given in question 7, what are the constraints, if any, which we have placed on the Account Number field for the end of year collection?

Account should have at most n digits.
If we had n duplicate Account Numbers then we will remove n-1 duplicate fields.
There are no constraints.
Account Number should be unique.

9. Suppose 100 people sign up for our system and of the 100 people, 60 of them did not input an address. The system lists the values as NULL for these empty entries in the address field. Would this situation still have structure for our data?

No, because the majority of data do not have a specific field filled, thus our originally defined structure is lost.
Yes, the data has structure because we have placed a structural constraint on the data, thus the data will always have the originally defined structure.
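
The registration scenario in questions 7 to 9 can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the table name, column names, and sample rows are assumptions. It shows a uniqueness constraint rejecting a repeated account number, and a NULL address leaving the table's columns unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE registrants ("
    " name TEXT,"
    " reg_date TEXT,"
    " address TEXT,"               # may be NULL when the user leaves it blank
    " account_number INTEGER UNIQUE"
    ")"
)
insert = "INSERT INTO registrants VALUES (?, ?, ?, ?)"
conn.execute(insert, ("Ann", "2016-01-05", None, 1001))
conn.execute(insert, ("Bob", "2016-02-11", "12 Elm St", 1002))

# A second row with an existing account number violates the UNIQUE constraint.
try:
    conn.execute(insert, ("Ann", "2016-03-02", None, 1001))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The schema is the same whether or not individual address values are NULL.
columns = [row[1] for row in conn.execute("PRAGMA table_info(registrants)")]
null_addresses = conn.execute(
    "SELECT COUNT(*) FROM registrants WHERE address IS NULL"
).fetchone()[0]
print(duplicate_rejected, columns, null_addresses)
```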


The complete Big Data Modeling and Management Systems course is currently offered by UC San Diego through the Coursera platform and is Course 2 of 6 in the Big Data Specialization.

About this Course: This course is for those new to data science. Completion of Intro. to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications.


Big Data Modeling and Management Systems Week 1 Quiz Answers!

Q1. (Questions 1-3 pertain to the video lecture “Exploring the Relational Data Model of CSV”) What is the approximate population of La Paz County in the state of Arizona for the CENSUS2010POP (column H)? (Choose the best answer.)

15000
25000
10000
20000

Q2. What county in the state of Wyoming has the smallest estimated population?

Platte
Uinta
Niobrara
Sweetwater

Q3. At 2:45 of the video, the Instructor creates a filter for all of the counties in California with a population greater than 1,000,000. However, included in the results is the entire state of California. This anomalous value might skew our analysis if, for example, we wanted to compute the average population of these results. What additional filter might work to resolve this problem?

Add a filter to detect and remove results which do not include the word “County” in column G.
Add a filter which finds all counties with population greater than 100,000 AND less than 10,000,000 for column H (CENSUS2010POP).
Add a filter where the value in column E is greater than 1,000,000.
None of the above
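
For readers who prefer code to spreadsheet filters, the sketch below shows how a state-level summary row can distort an average and how a name-based filter removes it. The rows, names, and populations are invented placeholders, not values from the census file, and the dictionary keys are assumptions standing in for columns G and H.

```python
# Placeholder rows: one state-level summary row plus three county rows.
rows = [
    {"name": "Example State", "census2010pop": 30_000_000},
    {"name": "Alpha County", "census2010pop": 9_000_000},
    {"name": "Beta County", "census2010pop": 2_500_000},
    {"name": "Gamma County", "census2010pop": 400_000},
]

# Population filter only: the state-level row slips through.
over_one_million = [r for r in rows if r["census2010pop"] > 1_000_000]

# Population filter plus a check that the name contains the word "County".
large_counties = [r for r in over_one_million if "County" in r["name"]]

average = sum(r["census2010pop"] for r in large_counties) / len(large_counties)
print(len(over_one_million), len(large_counties), average)
```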

Q4. (Questions 4 and 5 pertain to the video “Exploring Sensor Data”) How often (in seconds) do the R5 measurements occur?

Q5. What is the field for rain accumulation?

Q6. (Questions 6 and 7 pertain to the video lecture “Exploring the Array Data Model of an Image”) What is the (Red, Green, Blue) pixel value for location 500, 2000?

(163, 118, 79)
(134, 145, 46)
(50, 156, 182)
(100, 123, 149)

Q7. Is this value likely to be land or ocean?

Land
Ocean

Q8. (Questions 8 and 9 pertain to the video lecture “Exploring the Semistructured Data Model of JSON”) Given a tweet, what path would you most likely enter to obtain a count of the number of followers for a user?

user/followers_count
user/statuses_count
user/listed_count
None of the above
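
Slash-separated paths like the ones above can be read as a walk from parent to child through the JSON tree. The sketch below is a minimal illustration: `get_path` is a hypothetical helper, and the tweet is a tiny made-up document that borrows a few field names from the questions.

```python
import json


def get_path(node, path: str):
    """Follow a slash-separated path of keys down through nested JSON objects."""
    for key in path.split("/"):
        node = node[key]  # step from the current node to the named child
    return node


# A made-up, heavily trimmed tweet-like document.
tweet = json.loads(
    '{"user": {"followers_count": 42, "statuses_count": 7},'
    ' "entities": {"urls": [], "user_mentions": []}}'
)

print(get_path(tweet, "user/followers_count"))
print(sorted(get_path(tweet, "entities")))
```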

Q9. Which of the following fields are nested within the ‘entities’ field (select all that apply)?

tweets
user_mentions
events
views
symbols
urls

Big Data Modeling and Management Systems Week 2 Quiz Answers — Data Models Quiz Answers!

Q1. What is a possible pitfall of utilizing Excel as a way to manipulate small databases?

Excel does not enforce many principles of relational data models.
Excel is a user program and thus cannot run on a server.
Excel does not allow algorithms for data manipulation.

Q2. What does the term “atomic” mean in the context of relational databases?

Fixed schema of a particular database.
A tuple that cannot be reduced.
A column or row of data. Depends on the context.
One unit of information that cannot be decomposed.

Q3. What is the Pareto-Optimality problem?

Find the shortest path from source node to target node.
Find the best possible path given two or more optimization criteria where neither constraint can be fully optimized simultaneously.
Find the optimal path that requires going through specific nodes given by the user.

Q4. What constitutes a community within a graph?

High density of nodes at a certain location.
A neighborhood defined by an integer constant K around a specific node. All K+1 nodes belong in another community.
A dense amount of edge connections between nodes in a community and a few connections across communities.
Many anomalous neighborhoods within the same vicinity.

Q5. Why are trees useful for semi-structured data such as XML and JSON?

Computers can easily visualize the data with a tree structure.
It is not always the case that XML and JSON can be represented as trees.
Trees take advantage of the parent-child relationship of the data for easy navigation.
They are only useful for XML data, as the tree-like structure is apparent with tags, while JSON does not contain a tree-like structure because it contains arrays.

Q6. What is the general purpose of modeling data as vectors?

Enables weighting of the query.
The ability to normalize vectors allowing probability distributions.
Enables image searching.
Results can be ordered by similarity using vector projection.

Q7. For the following questions 7, 8, and 9, suppose a registration website creates data with the following fields for each person registered (note: if the user does not input a value, NULL is stored instead): Name, Date, Address, and Account Number.

Suppose we collect data month by month. Each month, we would have a batch of data containing the fields listed above. At the end of the year, we want to summarize our registrant activities for the entire year, so we would remove redundancies in our data by removing any records with duplicate account numbers from month to month. What type of operation do we use in this scenario?

Join
Not an Operation
Subsetting
Union

Q8. From the information given in question 7, what are the constraints, if any, which we have placed on the Account Number field for the end of year collection?

Account should have at most n digits.
If we had n duplicate Account Numbers then we will remove n-1 duplicate fields.
There are no constraints.
Account Number should be unique.

Q9. Suppose 100 people sign up for our system and of the 100 people, 60 of them did not input an address. The system lists the values as NULL for these empty entries in the address field. Would this situation still have structure for our data?

No, because the majority of data do not have a specific field filled, thus our originally defined structure is lost.
Yes, the data has structure because we have placed a structural constraint on the data, thus the data will always have the originally defined structure.

Big Data Modeling and Management Systems Week 3 Quiz Answers: Data Formats and Streaming Data Quiz Answers!

Q1. What is true between data modeling and the formatting of the data?

There is a one to one correspondence between formatting data and data modeling. For every model of data, there is only one way to store the data.
There is always one specific schema for storing model data that is the best and preferred method for the specific data representation.
The data does not necessarily need to be formatted in a way that represents the data model. Just so long as it can be extrapolated.

Q2. What is streaming?

Calculating results using real time data otherwise known as streaming data.
Using static data stored from a real time source in order to process and guide the application.
Utilizing real time data to compute and change the state of an application continuously.
Using sensors to manipulate the system, such as a smart car being able to drive by itself using sensors to detect road hazards.

Q3. Of the following, what best describes the properties of working with streaming data?

Small time windows for working with data.
Data is always utilized for streaming the application.
Data manipulation is near real time.
Independent computations that do not rely on previous or future data.
Always unbounded in sequence, in other words, data is not guaranteed to be in order.
Does not ping the source interactively for a response upon receiving the data.

Q4. What is a characteristic of streaming data?

Data is unbounded in size but requires only finite time and space to process it.
The data is unbounded in size and the size determines the time and space of processing the data.
The data is finite and requires only finite time and space to process the data.
Data is finite in size and size determines the time and space of processing the data.
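
To illustrate how a potentially endless stream can be summarized in a fixed amount of memory, the sketch below maintains a running mean that stores only a count and the current mean. The class name and the sample readings are hypothetical.

```python
class RunningMean:
    """Incrementally updated mean that never stores the stream itself."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> float:
        """Fold one new value into the mean using constant time and space."""
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


stats = RunningMean()
for reading in [2.0, 4.0, 6.0, 8.0]:  # stand-in for values arriving over time
    stats.update(reading)
print(stats.count, stats.mean)
```

Each update touches two numbers regardless of how many values have already arrived, which is the property that makes this style of computation workable on unbounded input.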

Q5. What type of algorithm is required for analyzing streaming data?

Accurate and Consistent
Accurate and Memory Efficient
Fast and Complex
Fast and Simple

Q6. What is lambda architecture?

A specific method for processing streaming data using special real time processes.
A specific hardware architecture for a server made specifically for processing real time data.
A method to process streaming data by utilizing batch processing and real time processing.
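
A toy sketch of combining batch and real time processing: a batch view is precomputed from older events, a real time view covers events that arrived since the last batch run, and a query merges the two. The event format, function names, and page-view counting task are all assumptions chosen for brevity.

```python
from collections import Counter
from typing import Dict, List

Event = Dict[str, str]


def build_view(events: List[Event]) -> Counter:
    """Count page views per page; used for both the batch and the real time view."""
    return Counter(event["page"] for event in events)


def query(batch_view: Counter, realtime_view: Counter, page: str) -> int:
    """Answer a query by merging the precomputed batch view with the recent view."""
    return batch_view[page] + realtime_view[page]


# Events already covered by the last (slow, complete) batch run.
historical = [{"page": "home"}, {"page": "home"}, {"page": "about"}]
# Events that arrived afterwards and are handled incrementally.
recent = [{"page": "home"}, {"page": "pricing"}]

batch_view = build_view(historical)
realtime_view = build_view(recent)
print(query(batch_view, realtime_view, "home"))
```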

Q7. Of the following, which best represents the challenge regarding the size and frequency of data?

The size and frequency of the streaming data may be too small.
The size and frequency of the streaming data may be sporadic.
There may not be data to produce the notion of size and frequency.

Q8. What is the difference between data lakes and data warehouses?

Data lakes house raw data while data warehouses contain pre-formatted data.
Data lakes contain only files while data warehouses contain only databases.
Data lakes utilize hierarchical systems while data warehouses use object storage.

Q9. What is schema-on-read?

The process where formatted data is given structure when read.
Another name for data lakes.
Data is stored as raw data until it is read by an application where the application assigns structure.
The process where data is pre-formatted prior to being read but the schema is loaded on read.
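
The idea of assigning structure only at read time can be sketched as follows. Raw JSON lines are stored untouched, and the reading application decides which fields it wants and how to type them. The store contents, field names, and the `read_with_schema` helper are hypothetical.

```python
import json
from typing import Dict, Iterator, List, Optional

# Raw records kept exactly as they arrived; no structure is enforced at write time.
raw_store: List[str] = [
    '{"name": "Ann", "account": "1001"}',
    '{"name": "Bob"}',
    '{"account": "1003", "extra": true}',
]


def read_with_schema(lines: List[str]) -> Iterator[Dict[str, Optional[object]]]:
    """Apply this application's schema (name: text, account: integer) while reading."""
    for line in lines:
        doc = json.loads(line)
        yield {
            "name": doc.get("name"),  # missing fields become None
            "account": int(doc["account"]) if "account" in doc else None,
        }


for row in read_with_schema(raw_store):
    print(row)
```

A different application could read the same raw lines with a different schema, for example one that also keeps the `extra` field.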

Big Data Modeling and Management Systems Week 4 Quiz Answers: BDMS Quiz Answers!

Q1. The desired characteristics of a BDMS include (select all that apply):

Narrow range of query sizes
Continuous data ingestion
Support for common “Big Data” data types
Support for ACID
A full query language
A flexible semi-structured data model

Q2. Fill in the blank with the best answer: CAP theorem states that _________ all at once within a distributed computer system?

it is impossible to have consistency, accuracy, and partial tolerance
it is necessary to have consistency, accuracy, and partial tolerance
it is necessary to have consistency, availability, and partition tolerance
it is impossible to have consistency, availability, and partition tolerance

Q3. What is the purpose of the acronym BASE?

The same as ACID.
To overcome CAP theorem.
To impose properties on a BDMS in order to guarantee certain results.
Enables stricter enforcement of ACID type design.

Q4. What are ziplists in Redis?

A special type of data type that can store up to 512 MB of image data.
A look up table that is stored as a value in the database. Look up table points to actual values in memory.
A compressed list that is stored within the value of the database.
A special type of data type that can store hashes that point to multiple attributes.

Q5. What is one of the main features of Aerospike?

Images as values within the database.
Enables real time data streaming from external sources.
Support for geospatial data storage and geospatial queries.
Better equipped for string based search applications.

Q6. What database would be best suited for the following scenario: An app development company is trying to implement a cloud based storage system for their new map-based app. The cloud will manage the longitude and latitude of the data in order to track user location.

Solr
Vertica
Aerospike
Redis

Q7. What database would be best suited for the following scenario: A big wholesale company is trying to implement a search engine for their products.

Redis
Aerospike
Solr
Vertica

Q8. Which of the following data types are supported by Redis? (select all that apply)

Sorted Sets
Images
Hashes
Lists
Streaming Video
Strings

Example: What is CanCollide in Roblox?

-- CanCollide is a property of a part that controls whether it physically collides with other objects

-- To make objects rest on the part instead of passing through it
script.Parent.CanCollide = true

-- To let objects pass (fall) through the part
script.Parent.CanCollide = false

Content Infrastructure for the Connected World

CONTENT IS DATA

TerminusCMS is an open-source headless content and knowledge management system. A dev-first enterprise knowledge graph to break down departmental knowledge silos. Incorporate content with operational and transactional data to discover and use organization-wide knowledge for your front-of-house and back-office front ends.

An organization-wide knowledge graph with the analytical power to unlock enterprise potential

Build a semantically connected content and knowledge model to curate cross-divisional data, content, and documentation. Data is stored as machine-readable JSON documents which are exposed as GraphQL and Datalog APIs for schema, query, and updates.

Demo

Admin UI

Model Schema

GraphQL API

Change Requests


KNOWLEDGE & CONTENT MANAGEMENT

Back Office, Apps, Portals, Websites, & Analytics

TerminusCMS is an enterprise knowledge graph to make content, knowledge, and data discoverable and usable.

Greater Query Power

Graph queries leveraging semantic relationships and analytics engine powered by GraphQL & Datalog.

Schema as Code

Flexible and extendable JSON schema syntax to model semantically enriched content models with code.

Provenance & Version Control

Immutable data provides Git-like features such as branch, rebase, clone, rollback, and time-travel.

Change Request Workflows

Change request workflows built into the data layer to provide approval processes and security.

Interoperable Standards

Using JSON & RDF standards ensures interoperability across applications and devices.

100% Open Source

Choose a package that works for you. Self-host with our open-source install, or choose a hosted version, including dedicated compute resources.

USE CASES

Document properties, ID, relationships, key strategy, and JSON view

Create teams & data products to work collaboratively.

Visual schema builder & validator

Database admin & query playground tools

TERMINUSDB — THE DATABASE

An in-memory, distributed, and open-source document graph database for people who want the convenience of documents with the query power of graph relationships. For people who want data to be the star of their builds.

So much more than CMS

Get started in minutes and for free with our TerminusCMS Community Package. Clone an example from the dashboard to experiment and play today.
