
Types of Digital Data | What is a Digital Data? Definition, Features and More | data science

January 01, 2021

Types of Digital Data:

Digital data, in information theory and information systems, is the discrete, discontinuous representation of information or works. Numbers and letters are the most commonly used representations.


1.  Structured data:

  • Structured data is data whose elements are individually addressable, which makes analysis effective. It is organized into a formatted repository, typically a database.
  • It covers all data that can be stored in an SQL database as tables with rows and columns. Such data has relational keys and maps easily onto pre-designed fields.
  • Today it is the most commonly processed kind of data and the simplest to manage.
  • Example: relational data.
  • Structured data depends on the existence of a data model: a model of how data can be stored, processed, and accessed.
  • Because of the data model, each field is discrete and can be accessed separately or jointly with data from other fields.
  • This makes structured data extremely powerful: it is possible to quickly aggregate data from various locations in the database.
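
As a rough illustration of the points above, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the rows are all made up for this example. It shows one field being read on its own, and then fields from two tables being aggregated together through a relational key.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(id), "
    "amount REAL NOT NULL)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 20.0), (2, 1, 5.0), (3, 2, 12.5)],
)

# Each field is discrete, so a single column can be read on its own ...
names = [row[0] for row in conn.execute("SELECT name FROM customers ORDER BY id")]

# ... or jointly with fields from another table, via the relational key.
totals = conn.execute(
    "SELECT c.name, SUM(o.amount) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id "
    "GROUP BY c.id ORDER BY c.id"
).fetchall()

print(names)
print(totals)
```

The query works only because the data model is known in advance: the join and the sum rely on the column names and types declared in the schema.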

 
2. Unstructured data:

  • Unstructured data is data that is not organized in a predefined manner or has no predefined data model, so it is not a good fit for a mainstream relational database.
  • Alternative platforms exist for storing and managing unstructured data. It is increasingly prevalent in IT systems and is used by organizations in a variety of business intelligence and analytics applications.
  • Example: Word, PDF, text, media logs.
  • The ability to analyze unstructured data is especially relevant in the context of Big Data, since a large part of the data in organizations is unstructured. Think of pictures, videos, or PDF documents.
  • The ability to extract value from unstructured data is one of the main drivers behind the quick growth of Big Data.
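
To make "extracting value from unstructured data" a little more concrete, here is a small sketch that pulls payment records out of free text with a regular expression. The log lines and the pattern are hypothetical. Real unstructured sources such as images or PDFs need far heavier tooling, so treat this only as the simplest possible case.

```python
import re

# Hypothetical free-text log lines; there is no schema, only characters.
raw_text = """\
2021-01-01 09:15 user=alice paid 19.99 USD for order 1001
note: customer called about a late delivery
2021-01-01 10:02 user=bob paid 5.00 USD for order 1002
"""

# An assumed pattern for the lines we care about.
PAYMENT = re.compile(r"user=(?P<user>\w+) paid (?P<amount>\d+\.\d{2}) USD")


def extract_payments(text):
    """Return (user, amount) pairs found anywhere in the text."""
    return [(m["user"], float(m["amount"])) for m in PAYMENT.finditer(text)]


payments = extract_payments(raw_text)
print(payments)
```

Note that the structure comes from the pattern we wrote, not from the data. Any line the pattern does not anticipate is silently skipped.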

 

3. Semi-structured data:

  • Semi-structured data is information that does not reside in a relational database but has some organizational properties that make it easier to analyze. With some processing it can be stored in a relational database, but the semi-structured form exists to save space.
  • Example: XML data.
  • This third category exists because semi-structured data is considerably easier to analyze than unstructured data. Many Big Data solutions and tools can read and process either JSON or XML. This reduces the complexity of analyzing semi-structured data compared to unstructured data.
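
Here is a short sketch of why JSON and XML are easy for tools to process, using only Python's standard json and xml.etree.ElementTree modules. The record itself is invented for the example. There is no table schema, but the keys and tags give enough organization to address a nested value directly.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in two semi-structured formats.
json_text = '{"name": "Asha", "tags": ["vip", "retail"], "address": {"city": "Pune"}}'
xml_text = "<person><name>Asha</name><address><city>Pune</city></address></person>"

# JSON parses into nested dicts and lists that can be indexed by key.
record = json.loads(json_text)
city_from_json = record["address"]["city"]

# XML parses into a tree that can be searched by tag path.
root = ET.fromstring(xml_text)
city_from_xml = root.findtext("address/city")

print(city_from_json, city_from_xml)
```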

 
Difference between Structured, Semi-structured, Unstructured data.


Structured DATA:

  • It is based on relational database tables.
  • Mature transaction management and various concurrency techniques.
  • Versioning over tuples, rows, and tables.
  • It is schema-dependent and less flexible.
  • It is very difficult to scale the database schema.
  • Very robust.
  • Structured queries allow complex joins.

 

Semi-structured DATA:

  • It is based on XML/RDF.
  • Transaction management is adapted from DBMSs and is not mature.
  • Versioning over tuples or graphs is possible.
  • It is more flexible than structured data but less flexible than unstructured data.
  • Its scaling is simpler than that of structured data.
  • Newer technology, not very widespread.
  • Queries over anonymous nodes are possible.

 

Unstructured DATA:

  • It is based on character and binary data.
  • No transaction management and no concurrency control.
  • Versioned as a whole.
  • It is very flexible, and there is no schema.
  • It is very scalable.
  • Only textual queries are possible.

Types of Digital Data | What is a Digital Data? Definition, Features and More | data science Reviewed by technical_saurabh on January 01, 2021 Rating: 5

The 5 V's of Big Data | 5 ‘V’s of Big Data

January 01, 2021

5 ‘V’s of Big Data

  • The term big data emphasizes volume or size. Size is a relative term. In the 1960s, 20 megabytes was considered large. Now data is not considered big unless it is several hundred petabytes (PB) (1 petabyte = 10^15 bytes). Size is not the only property used to describe big data.
  • In addition to volume, there are other important properties, discussed in what follows:

 


1. Volume:

  • The amount of global digital data created, replicated, and consumed in 2013 was estimated by the International Data Corporation (a company which publishes research reports) at 4.4 zettabytes (ZB) (1 zettabyte = 10^21 bytes). It is doubling every 2 years.
  • By 2015, digital data had grown to 8 ZB and was expected to grow to 12 ZB in 2016. To give an idea of a ZB, it is the storage required for 200 billion high-definition movies, which would take a person 24 million years to watch!
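
The figures above can be sanity-checked with a few lines of arithmetic. The sketch below assumes smooth exponential growth from the 2013 estimate (the function name and its parameters are just for illustration). It also works out what the movie comparison implies per movie.

```python
PB = 10**15  # one petabyte in bytes
ZB = 10**21  # one zettabyte in bytes


def projected_zb(year, base_year=2013, base_zb=4.4, doubling_years=2):
    """Volume in ZB if the base estimate doubles every `doubling_years` years."""
    return base_zb * 2 ** ((year - base_year) / doubling_years)


# Storage per movie implied by "1 ZB holds 200 billion movies".
bytes_per_movie = ZB // (200 * 10**9)

# Running time per movie implied by "24 million years to watch them all".
hours_per_movie = 24e6 * 365.25 * 24 / 200e9

print(ZB // PB)                      # petabytes in one zettabyte
print(round(projected_zb(2015), 1))  # 8.8
print(round(projected_zb(2016), 1))  # 12.4
print(bytes_per_movie)               # 5000000000, i.e. about 5 GB
print(round(hours_per_movie, 2))     # 1.05
```

Doubling every 2 years from 4.4 ZB gives about 8.8 ZB for 2015 and about 12.4 ZB for 2016, which is in line with the rounded 8 ZB and 12 ZB quoted above. The movie comparison works out to roughly 5 GB and a little over one hour per movie.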

 

2. Variety:

  • In the 1960s, the predominant data types were numbers and text. Today, in addition to numbers and text, there are image, audio, and video data. The Large Hadron Collider (LHC) and earth and polar observations generate mainly numeric data. Word processors, emails, tweets, blogs, and other social media generate primarily unstructured textual data.
  • Medical images and the billions of photographs that people take with their mobile phones are image data. Surveillance cameras and movies produce video data. Music sites store audio data. Most data in the 1980s were structured and organized as tables with keys. Today, unstructured and multimedia data are often used together.

 

3. Velocity:

  • Data in conventional databases used to change slowly. Now most data are real time. For example, phone conversations, data acquired from experiments, data sent by sensors, data exchanged using the Internet, and stock price data are all real time.
  • Large amounts of data are transient and need to be analyzed as and when they are generated. They become irrelevant fast.
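
One simple way to analyze data "as and when it is generated" is to process it as a stream and keep only a small window in memory. The sketch below is one possible approach, not a standard big data tool: a Python generator that emits a moving average after every new value. The tick values are hypothetical stock prices.

```python
from collections import deque


def moving_average(stream, window=3):
    """Yield the mean of the last `window` values as each new value arrives."""
    recent = deque(maxlen=window)  # older values are dropped automatically
    for value in stream:
        recent.append(value)
        yield sum(recent) / len(recent)


# Hypothetical stock ticks; in practice this would be a live feed.
ticks = [100.0, 102.0, 101.0, 105.0]
averages = list(moving_average(ticks))
print(averages)
```

Because the generator never stores more than `window` values, old data is discarded as soon as it stops being relevant.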

 

4. Veracity:

  • A lot of the data generated are noisy, e.g., data from sensors. Data are often incorrect. For example, many websites you access may not have correct information. It is difficult to be absolutely certain about the veracity of big data.
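
As a small illustration of dealing with noisy sensor data, here is a sketch of a median filter, one common way to suppress isolated spikes. The readings are invented, and the filter is only an example technique: it cannot tell you which values are actually true, it only damps obvious outliers.

```python
from statistics import median


def median_filter(readings, window=3):
    """Replace each reading with the median of its neighbourhood."""
    half = window // 2
    smoothed = []
    for i in range(len(readings)):
        lo = max(0, i - half)
        hi = min(len(readings), i + half + 1)
        smoothed.append(median(readings[lo:hi]))
    return smoothed


# Hypothetical temperature readings with one faulty spike (95.0).
readings = [20.1, 20.3, 95.0, 20.2, 20.4]
cleaned = median_filter(readings)
print(cleaned)
```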

 

5. Value:

  • Data by itself is of no value unless it is processed to obtain information on which one may act. The large volume of data makes processing difficult; fortunately, computing power and storage capacity have also increased enormously.
  • A huge number of inexpensive processors working in parallel has made it feasible to extract useful information and detect patterns from big data. Distributed file systems such as the Hadoop Distributed File System (HDFS), coupled with parallel processing programs such as MapReduce, are associated with big data as software tools to derive value from big data.
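
The MapReduce idea can be sketched in plain Python. This is not the Hadoop API. It is a single-machine imitation of the programming model, with made-up function names, in which a thread pool stands in for the many processors and a list of strings stands in for file blocks on HDFS.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # Each chunk is mapped independently, so the work can run in parallel.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, chunks))
    return reduce_phase(shuffle(mapped))


# Hypothetical chunks standing in for blocks of a large distributed file.
chunks = ["big data big value", "data has value"]
counts = word_count(chunks)
print(counts)
```

The key property is that the map step never needs to see more than one chunk, which is what lets a real cluster spread the work over inexpensive machines.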


Data Science | Components of Data Science | Application of Data science

January 01, 2021

Data Science

  • The fundamental concepts of data science are drawn from many fields that study data analytics.
  • Fundamental concept: extracting useful knowledge from data to solve business problems can be treated systematically.
  • Data scientists play active roles in the design and implementation work of four related areas: data architecture, data acquisition, data analysis, and data archiving.
  • Key skills highlighted by the brief case study include communication skills, data analysis skills, and ethical reasoning skills.

 

Components of Data Science:



1.  Statistics:

  • Statistics is one of the most important components of data science.
  • Statistics is a way to collect and analyze large amounts of numerical data and find meaningful insights from it.
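
As a tiny example of getting an insight from numbers, the sketch below uses Python's standard statistics module on some made-up daily sales figures. Comparing the mean with the median is a quick way to notice that one day is very different from the rest.

```python
import statistics

# Hypothetical units sold per day over one week.
daily_sales = [12, 15, 11, 14, 90, 13, 15]

mean_sales = statistics.mean(daily_sales)      # pulled upward by the 90
median_sales = statistics.median(daily_sales)  # robust to the single spike
spread = statistics.stdev(daily_sales)         # sample standard deviation

print(round(mean_sales, 2), median_sales, round(spread, 2))
```

Here the median is 14 while the mean is above 24, which suggests that the 90 is an unusual day worth a closer look.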

 

2. Visualization:

  • Data visualization means representing data in a visual context so that people can easily understand its significance.
  • Data visualization makes it easy to take in huge amounts of data in visual form.

 

3. Data engineering:

  • Data engineering is a part of data science that involves acquiring, storing, retrieving, and transforming data.
  • Data engineering also includes adding metadata (data about data) to the data.

 

4. Advanced computing:

  • Advanced computing does the heavy lifting of data science.
  • Advanced computing involves designing, writing, debugging, and maintaining the source code of computer programs.

 

5. Machine learning:

  • Machine learning is about training a machine so that it can act like a human brain. In data science, we use various machine learning algorithms to solve problems.
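
To give a feel for what "training a machine" can look like, here is a minimal sketch of one of the simplest algorithms, a nearest-neighbor classifier, written in plain Python (it uses math.dist, so it assumes Python 3.8 or later). The training points and labels are invented for the example. Real projects would normally use a dedicated library.

```python
import math


def nearest_neighbor(train, point):
    """Return the label of the training example closest to `point`."""
    _, label = min(train, key=lambda example: math.dist(example[0], point))
    return label


# Hypothetical labelled examples: (features, label).
train = [
    ((1.0, 1.0), "small"),
    ((1.2, 0.8), "small"),
    ((5.0, 5.5), "large"),
    ((6.0, 5.0), "large"),
]

print(nearest_neighbor(train, (0.9, 1.1)))
print(nearest_neighbor(train, (5.5, 5.2)))
```

The "learning" here is just remembering the labelled examples. A new point gets the label of whichever remembered example it most resembles.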

 

Advantages of Data Science:

  • Data science helps organizations know how and when their products sell best, so that products are delivered to the right place at the right time.
  • Organizations make faster and better decisions to improve efficiency and earn higher profits.
  • It helps an organization's marketing and sales teams identify and refine their target audience.
  • It has made it comparatively easier to sort data and look for the best candidates for an organization. Big data and data mining have made processing and selecting CVs, aptitude tests, and games easier for recruitment teams.

 

Disadvantages of Data Science:

  • Information extracted from structured as well as unstructured data for further use can also be misused against a group of people of a country or some committee.
  • The tools used for data science and analytics are expensive to use. They are also complex, so people have to learn how to use them.

 

Applications of Data Science:

  • Fraud and risk detection
  • Healthcare
  • Virtual assistance for patients and customer support
  • Internet search
  • Targeted advertising
  • Website recommendations
  • Advanced image recognition
  • Speech recognition
  • Airline route planning
  • Gaming
  • Augmented reality
