Bankim L. Radadiya1 and Parag Shukla2*
1Director IT, Navsari Agricultural University, Navsari, Gujarat, India.
2Head-Department of MCA, Atmiya Institute of Technology and Science, Rajkot – 360005, India.
Corresponding author Email: paragshukla007@gmail.com
DOI : http://dx.doi.org/10.13005/ojcst/10.04.16
Article Publishing History
Article Received on : 4-12-2017
Article Accepted on : 11-12-2017
Article Published : 12 Dec 2017
ABSTRACT:
Data is growing rapidly day by day, and analyzing it has become a necessity. According to a recent survey, more data was generated in the last two years than in all of prior human history. Data is created in different forms and in diverse ways: it can be structured, semi-structured, or unstructured. To analyze diverse agricultural data, we can use Big Data tools such as Pig. Pig is a platform for data analysis that can handle many varieties of data. Its biggest advantages are that it can process diverse data quickly and that it allows user-defined functions. A typical use case for Pig is ETL: data is extracted from its sources, transformed, and then loaded into a data warehouse. Using Pig, we analyze the state-wise percentage distribution of the number of operational holdings for all social groups in 2005-06 and 2010-11.
KEYWORDS:
Analysis; Agricultural data; Big Data Tools; Pig; Structured; Semi-Structured; Unstructured; Varieties.
Copy the following to cite this article:
Radadiya B. L, Shukla P. Analyzing Varieties of Agricultural Data Using Big Data Tools Pig. Orient.J. Comp. Sci. and Technol;10(4)
Copy the following to cite this URL:
Radadiya B. L, Shukla P. Analyzing Varieties of Agricultural Data Using Big Data Tools Pig. Orient. J. Comp. Sci. and Technol;10(4). Available from: http://www.computerscijournal.org/?p=7232
Introduction
Nowadays, data is growing very quickly, and analyzing it is a necessity for many organizations. According to a recent survey, more data was generated in the last two years than in all of prior human history. Data is created in different forms and in diverse ways: it can be structured, semi-structured, or unstructured. To analyze diverse agricultural data, we can use Big Data tools such as Pig. Pig is a platform for data analysis that can handle many varieties of data. Its biggest advantages are that it can process diverse data quickly and that it allows user-defined functions. A typical use case for Pig is ETL: data is extracted from its sources, transformed, and then loaded into a data warehouse.
In this study, we analyze varieties of agricultural data using the Big Data tool Pig.
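The ETL use case described above can be sketched in Pig Latin as follows. This is only an illustrative outline, not the pipeline used in this study: the relation names, both HDFS paths, and the three-column schema are assumptions.

```pig
-- Extract: read raw comma-separated records.
-- The path and the schema below are hypothetical.
raw = LOAD '/PARAG/state_data.csv' USING PigStorage(',')
      AS (state:chararray, census_small_2005:double, census_small_2010:double);

-- Transform: drop rows without a state name and normalize the name to upper case.
valid   = FILTER raw BY state IS NOT NULL;
cleaned = FOREACH valid GENERATE UPPER(state) AS state,
                                 census_small_2005,
                                 census_small_2010;

-- Load: write the cleaned records to a (hypothetical) staging directory,
-- from which a data warehouse could pick them up.
STORE cleaned INTO '/PARAG/state_data_clean' USING PigStorage(',');
```

STORE fails if the output directory already exists, so the target path would need to be removed or changed between runs.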
What is Pig?
Why Pig? & What Pig Supports?
Analysis of Structured Agricultural Data Using Pig
To analyze structured data, we must first identify the source of the data. Sources of structured data include RDBMSs such as Oracle, SQL Server, DB2, and MySQL, as well as spreadsheets and OLTP systems.
Step-1 Load the structured data.
We took the data on the state-wise percentage distribution of the number of operational holdings for all social groups in 2005-06 and 2010-11 from a government website.1
After retrieving the comma-separated values (CSV) file from the government website, we copied it to a Linux machine and then to HDFS. The following command copies the file from the Linux /root directory to the HDFS directory named PARAG. The copyFromLocal command copies a file from the local Linux file system to an HDFS directory; the local copy is left in place.
hadoop fs -copyFromLocal /root/state_data.csv /PARAG
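Once the file is in HDFS, it can be loaded into a Pig relation from the Grunt shell. The statement below is a minimal sketch rather than the exact statement used in the study: the relation name state_data and the three-column schema (state, census_small_2005, census_small_2010) are assumptions made for illustration, because the actual column layout of the CSV file is not shown here.

```pig
-- Load the CSV file copied into the HDFS directory /PARAG.
-- The relation name and the schema are assumed for illustration;
-- adjust the field list to match the real columns of the file.
state_data = LOAD '/PARAG/state_data.csv'
    USING PigStorage(',')
    AS (state:chararray,
        census_small_2005:double,
        census_small_2010:double);
```

PigStorage does not skip header lines. If the file starts with a header row, that row is loaded as data, and its numeric fields become null because they cannot be cast to double.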
Step-2 Display the loaded data
We can use the DUMP statement to display the data in the Grunt shell.
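As a hedged sketch of this step, the statements below load the file and print it. The relation name state_data and the schema are assumptions, not taken from the original study.

```pig
-- Assumed relation name and schema (see the note above).
state_data = LOAD '/PARAG/state_data.csv' USING PigStorage(',')
    AS (state:chararray, census_small_2005:double, census_small_2010:double);

-- Print the schema Pig has associated with the relation.
DESCRIBE state_data;

-- Run the job and print every tuple of the relation to the console.
DUMP state_data;
```

DUMP triggers execution of the statements that define the relation, so for a large file it may be preferable to DUMP a smaller relation produced with LIMIT.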
Step-3 Filter Specific Data
To analyze data, we can use filters or aggregate functions. Here, we filter the data for the state of Gujarat.
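A filter of this kind might look like the sketch below. The relation names, the schema, and the exact spelling and case of the state value ('Gujarat') are assumptions; the comparison is case-sensitive, so the literal has to match the value stored in the file.

```pig
-- Assumed relation name and schema.
state_data = LOAD '/PARAG/state_data.csv' USING PigStorage(',')
    AS (state:chararray, census_small_2005:double, census_small_2010:double);

-- Keep only the tuples whose state field equals 'Gujarat'.
gujarat_data = FILTER state_data BY state == 'Gujarat';

DUMP gujarat_data;
```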
Finding all states whose census_small value for 2005 is more than 30
Finding all states whose census_small value for 2010 is more than 30
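The two threshold queries above can be sketched as follows. The field names census_small_2005 and census_small_2010 are hypothetical stand-ins for the census_small values of the two years, and the relation names are likewise assumed.

```pig
-- Assumed relation name and schema.
state_data = LOAD '/PARAG/state_data.csv' USING PigStorage(',')
    AS (state:chararray, census_small_2005:double, census_small_2010:double);

-- States whose census_small value for 2005 is more than 30.
small_2005 = FILTER state_data BY census_small_2005 > 30.0;
DUMP small_2005;

-- States whose census_small value for 2010 is more than 30.
small_2010 = FILTER state_data BY census_small_2010 > 30.0;
DUMP small_2010;
```

Tuples with a null value in the compared field, such as a header row loaded as data, do not satisfy the condition and are dropped by FILTER.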
Analysis of Unstructured Agricultural Data Using Pig
Conclusion
Using Pig, we analyzed structured agricultural data on the state-wise percentage distribution of the number of operational holdings for all social groups in 2005-06 and 2010-11. The need for data analysis is growing day by day, and to demonstrate such analysis with the Big Data tool Pig, we used government agricultural data.
Analyzing data is a necessity for many organizations. Data is created in different forms and in diverse ways: it can be structured, semi-structured, or unstructured. Big Data tools such as Pig can be used to analyze diverse agricultural data. Pig is a platform for data analysis whose biggest advantages are that it can process diverse data quickly and that it allows user-defined functions. A typical use case for Pig is ETL: data is extracted from its sources, transformed, and then loaded into a data warehouse.
Acknowledgment
We wish to thank the Open Government Data (OGD) Platform for providing the data for analysis, and we sincerely thank our mentor.
References
- https://data.gov.in/resources/state-wise-percentage-distribution-number-operational-holdings-all-social-groups-during
- Apache Pig, https://pig.apache.org/
- Apache Pig Architecture and components of Pig [online resource] https://www.tutorialspoint.com/apache_pig/apache_pig_architecture.htm
- Pig Philosophy, https://pig.apache.org/philosophy.html
- Hive Vs Pig [online resource] http://www.bigdataanalyst.in/hive-vs-pig/
- Seema Acharya, Subhashini Chellapan, Big Data and Analytics, Wiley Publication
- Birendra Goswami, Pradip Kumar Chandra, "The Evolution of Big Data as a Research and Development", International Journal of Scientific Research and Engineering Studies (IJSRES), Volume 2, Issue 3, March 2015, ISSN: 2349-8862
- Online Resource https://data.gov.in/
This work is licensed under a Creative Commons Attribution 4.0 International License.