Top Rated Apache Pig Alternatives

It can handle simple mathematical operations as well as reducing the data. Aggregating the data is extremely useful. Running DateTime functions in Apache Pig is a really useful feature for getting results quickly. Pig works on datasets of around 150 to 180 GB per month and reduces them efficiently in roughly 10 to 12 minutes. I would definitely recommend Apache Pig to anyone in transportation engineering with basic coding skills, especially when you need to handle huge datasets. Review collected by and hosted on G2.com.
It cannot perform sequential operations, such as taking consecutive lines and comparing them. The workaround is to rank the segments, merge them, and then perform the task. The main drawback remains that it cannot run loops or nested loops over variables. Hive might be a better choice in certain cases for that reason.
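The rank-and-merge workaround mentioned above can be sketched in plain Python. This is only an illustration of the idea, not Pig code: the function name `consecutive_differences` and the sample `(timestamp, value)` readings are assumptions made for the example.

```python
from typing import List, Tuple

def consecutive_differences(
    readings: List[Tuple[int, float]],
) -> List[Tuple[int, int, float]]:
    """Compare each row with the next one via rank + self-join (hypothetical helper)."""
    # Step 1: rank the rows by timestamp, starting at 1.
    ranked = [
        (rank, ts, value)
        for rank, (ts, value) in enumerate(sorted(readings), start=1)
    ]
    # Step 2: build a shifted copy keyed by rank - 1, so row r+1 lines up with row r.
    next_by_rank = {rank - 1: (ts, value) for rank, ts, value in ranked}
    # Step 3: join the two copies on rank and compare the paired rows.
    result = []
    for rank, ts, value in ranked:
        if rank in next_by_rank:
            next_ts, next_value = next_by_rank[rank]
            result.append((ts, next_ts, next_value - value))
    return result

# Sample data (made up for illustration), deliberately out of order.
readings = [(3, 50.0), (1, 42.0), (2, 47.5)]
pairs = consecutive_differences(readings)
print(pairs)
```

The join on a shifted rank replaces the row-by-row loop that the reviewer says Pig lacks; the last row has no successor and is simply dropped.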
20 out of 21 Total Reviews for Apache Pig

Apache Pig and its query language (Pig Latin) allowed us to create data pipelines with ease. The language is designed to reflect the way data pipelines are designed, so it discards extraneous data, supports user-defined functions (UDFs), and offers a lot of control over the data flow.
Pig, being a lazily evaluated language, will not evaluate data until it is actually needed, so errors are not visible until you actually try to dump or print the data. There is no "debug" functionality to run the code in a dry-run mode.
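The deferred evaluation described above can be illustrated with Python generators, which behave in a loosely similar way: building the pipeline does nothing, and a bad record only fails once the output is requested (in Pig, by a DUMP or STORE). This is an analogy under stated assumptions, not Pig itself; `parse_speeds` and the sample lines are hypothetical.

```python
from typing import Iterable, Iterator

def parse_speeds(lines: Iterable[str]) -> Iterator[float]:
    """Lazily convert text lines to floats (hypothetical helper)."""
    for line in lines:
        yield float(line)  # raises ValueError only when this item is consumed

raw = ["42.0", "47.5", "not-a-number"]

# Defining the pipeline runs no parsing at all, so no error appears here.
pipeline = parse_speeds(raw)
doubled = (2 * speed for speed in pipeline)

# The failure only surfaces when the results are materialized.
error_seen = None
try:
    results = list(doubled)
except ValueError as exc:
    error_seen = exc

print(type(error_seen).__name__)
```

The practical consequence matches the review: a mistake early in the script can stay hidden until the step that finally forces output.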
What I like best about Apache Pig is how efficiently we can write complex MapReduce or Spark jobs without deep knowledge of Java, Python, or Groovy. It is also easy to control the execution of a job with Pig.
What I dislike about Apache Pig is that debugging errors consumes most of the development time, as it can sometimes be immature or unstable. Also, the support community is much smaller than that for Hadoop MapReduce or Spark.
It is easy to learn and get into production. It automates important MapReduce tasks into SQL-like queries.
- Not all tasks in Big Data can be completed using Pig.
A small number of instructions handles the big tasks of collecting, loading, and consolidating data.
Not enough tools to debug
Incorrect/misleading exceptions

Apache Pig is a 1st pass compiler, which is at its best using a DAG.
If you want to drill down and use complex structures, it is not the best way.
1. Ease of use, its performance
2. MapReduce is fully abstracted
3. Ability to chain multiple MR jobs into a single Pig script
4. Allows you to quickly crank through big data to get some analytics done
1. Slower in performance compared to Spark
2. Less support, e.g. string concatenation only allows 2 at a time, cannot sort and filter inside GROUP BY, etc.
3. Cannot read other input formats such as CSV or Parquet the way Spark can
4. Error handling needs to be better. Not easy to debug UDFs
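The two-at-a-time concatenation limit in point 2 is typically worked around by nesting calls, for example CONCAT(a, CONCAT(b, c)). A minimal Python sketch of the same folding idea follows; `concat2` and `concat_many` are hypothetical names used only for illustration.

```python
from functools import reduce
from typing import Iterable

def concat2(left: str, right: str) -> str:
    """Stand-in for a concatenation function that accepts exactly two arguments."""
    return left + right

def concat_many(parts: Iterable[str]) -> str:
    """Join any number of strings by repeatedly applying the two-argument function."""
    return reduce(concat2, parts, "")

joined = concat_many(["2016", "-", "03"])
print(joined)
```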
- SQL-like syntax
- powerful and feature-rich
- Much more difficult to use than Hive
- Takes a while to get used to and learn compared with Hive

Creating UDAFs easily.
Pig Latin is manageable and easy to write.
Can be streamed through Python and scripted rather than writing an MR job.
Not as truly scalable as writing an MR job.
Joins are easy, but not as easy as Hive queries.
Doesn't handle Parquet very well.
Not as fast and flexible as Spark.

The ecosystem and the way it works. Being able to implement and integrate what you currently use.
I think getting started is a bit patchy but once you're familiar and used to it, it can be very helpful.