hadoop - Hive always gives "Number of reduce tasks determined at compile time: 1", no matter what I do


create external table if not exists my_table (customer_id string,ip_id string) location 'ip_b_class'; 

and then:

hive> set mapred.reduce.tasks=50;
hive> select count(distinct customer_id) from my_table;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1

There's 160 GB in there, and 1 reducer takes a long time...

[ihadanny@lvshdc2en0011 ~]$ hdu
Found 8 items
162808042208   hdfs://horton/ip_b_class

...

Logically you cannot have more than 1 reducer here. Unless all the distinct customer ids from the individual map tasks come to one place, distinctness cannot be established and a single count cannot be produced. In other words, unless you heap all the customer ids together in one place, you cannot tell that each one is distinct and count them.
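To make the reasoning above concrete, here is a minimal Python sketch (not Hive code). The sample ids and both function names are made up for illustration. It shows only that adding up per-task distinct counts over-counts ids that appear in more than one map task, while gathering every id in one place gives the right answer.

```python
# Hypothetical sample data: customer ids as seen by three separate map tasks.
map_outputs = [
    ["c1", "c2", "c3"],
    ["c2", "c3", "c4"],
    ["c1", "c4", "c5"],
]


def sum_of_local_distinct_counts(splits):
    """Count distinct ids inside each split and add the counts together.

    Ids that occur in several splits are counted once per split.
    """
    return sum(len(set(split)) for split in splits)


def global_distinct_count(splits):
    """Gather every id in one place, as a single reducer would, then count."""
    all_ids = set()
    for split in splits:
        all_ids.update(split)
    return len(all_ids)


naive = sum_of_local_distinct_counts(map_outputs)
exact = global_distinct_count(map_outputs)
print(naive, exact)
```

With this sample data the two numbers differ, because c1 to c4 each appear in two of the three lists.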

