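Summary: Statistics from a Large Sample, from Weekly Contest 142. We sampled integers between 0 and 255 and stored the results in an array count, where count[k] is the number of samples of integer k. We return the minimum, maximum, mean, median, and mode of the sample, in that order, as an array of floating-point numbers.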
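Preface
"Statistics from a Large Sample", from Weekly Contest 142:
We sampled integers between 0 and 255 and stored the results in an array count: count[k] is the number of samples of integer k.
Return the minimum, maximum, mean, median, and mode of the sample, in that order, as an array of floating-point numbers. The mode is guaranteed to be unique.
First, recall the definition of the median:
If the sample is sorted and has an odd number of elements, the median is the middle element;
If the sample is sorted and has an even number of elements, the median is the average of the two middle elements.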
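The two median rules can be written down directly for a sample that is already stored as a sorted array. The following is only a minimal sketch; the class name MedianSketch and the method name medianOfSorted are made up for illustration and are not part of the problem.

```java
final class MedianSketch {
    /** Median of a non-empty array that is already sorted in ascending order. */
    static double medianOfSorted(int[] sorted) {
        int n = sorted.length;
        if (n == 0) {
            throw new IllegalArgumentException("sample must not be empty");
        }
        if (n % 2 == 1) {
            // Odd length: the single middle element.
            return sorted[n / 2];
        }
        // Even length: average of the two middle elements.
        // The cast to double keeps the addition from overflowing for large ints.
        return ((double) sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        // Odd number of elements (11): the middle element is at index 5.
        System.out.println(medianOfSorted(new int[] {1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4}));
        // Even number of elements (8): the middle elements are at indices 3 and 4.
        System.out.println(medianOfSorted(new int[] {1, 2, 2, 2, 3, 3, 3, 3}));
    }
}
```

This form only works when the sample itself fits in memory; the problem supplies counts instead, so the actual solution has to locate the middle positions without building the array.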
Example 1:
Input: count = [0,1,3,4,0,...,0] (256 entries; every entry after index 3 is 0)
Output: [1.00000,3.00000,2.37500,2.50000,3.00000]
Example 2:
Input: count = [0,4,3,2,2,0,...,0] (256 entries; every entry after index 4 is 0)
Output: [1.00000,4.00000,2.18182,2.00000,1.00000]
Constraints:
count.length == 256
1 <= sum(count) <= 10^9
The mode of the sample represented by count is unique
Answers within 10^-5 of the true value are accepted as correct
Approach
This problem is rated Medium. The first step is to understand the statement: the input array count is effectively a compressed representation of the sample.
We sampled integers between 0 and 255 and stored the results in an array count: count[k] is the number of samples of integer k.
Put simply, count[k] is the number of times the value k occurs in the uncompressed array. Take the count from Example 2:
[0,4,3,2,2,0,...,0] (256 entries; every entry after index 4 is 0)
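The decompression proceeds as follows:
Element 0 is 0, so the decompressed array is []
Element 1 is 4, so the decompressed array is [1,1,1,1]
Element 2 is 3, so the decompressed array is [1,1,1,1,2,2,2]
Element 3 is 2, so the decompressed array is [1,1,1,1,2,2,2,3,3]
Element 4 is 2, so the decompressed array is [1,1,1,1,2,2,2,3,3,4,4]
...... remaining steps omitted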
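The decompression steps above can be sketched in a few lines of Java. This is an illustration only, under the assumption that the sample is small: with sum(count) allowed to reach 10^9, materializing the array is not practical for the real problem. The names CountExpansionSketch and expand are hypothetical.

```java
import java.util.Arrays;

final class CountExpansionSketch {
    /** Expands count into the sorted sample it represents: value k appears count[k] times. */
    static int[] expand(int[] count) {
        long total = 0;
        for (int c : count) {
            total += c;
        }
        // Guard: this sketch is only meant for small samples.
        if (total > 1_000_000) {
            throw new IllegalArgumentException("sample too large to expand: " + total);
        }
        int[] sample = new int[(int) total];
        int pos = 0;
        for (int k = 0; k < count.length; k++) {
            // Write count[k] copies of the value k, starting at the current position.
            Arrays.fill(sample, pos, pos + count[k], k);
            pos += count[k];
        }
        return sample;
    }

    public static void main(String[] args) {
        int[] count = new int[256];
        count[1] = 4;
        count[2] = 3;
        count[3] = 2;
        count[4] = 2;
        System.out.println(Arrays.toString(expand(count)));
    }
}
```

Because the indices of count are visited in ascending order, the expanded array comes out already sorted, which is why the median can later be located by position alone.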
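With the structure of count understood, we use a TreeMap to process it, storing each valid number together with its number of occurrences (a valid number is an index k for which count[k] is not 0). Each metric is then computed as the problem requires:
Minimum: the first key in the TreeMap
Maximum: the last key in the TreeMap
Mean: the sum of key × value over all entries, divided by the sum of the values
Median:
Compute the actual number of elements in the sample (the sum of the values)
Depending on whether that number is odd or even, take the middle element or the average of the two middle elements
Mode: the most frequent number, that is, the key of the entry with the largest value in the TreeMap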
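For comparison, the same five metrics can also be computed by scanning count directly, without a TreeMap. The sketch below is an alternative, not the approach this article implements; the names ArrayScanStatsSketch and valueAtRank are hypothetical. It finds the median by looking up the values at the 0-based positions (total - 1) / 2 and total / 2 of the sorted sample, which coincide when the total is odd.

```java
final class ArrayScanStatsSketch {
    /** Value at the given 0-based position of the sorted sample represented by count. */
    static int valueAtRank(int[] count, long rank) {
        long seen = 0;
        for (int k = 0; k < count.length; k++) {
            seen += count[k];
            if (rank < seen) {
                return k;
            }
        }
        throw new IllegalArgumentException("rank out of range: " + rank);
    }

    /** Returns {minimum, maximum, mean, median, mode}. */
    static double[] sampleStats(int[] count) {
        int min = -1;
        int max = -1;
        int mode = -1;
        long total = 0; // number of samples
        long sum = 0;   // sum of all samples; long, since it can exceed the int range
        for (int k = 0; k < count.length; k++) {
            if (count[k] == 0) {
                continue;
            }
            if (min < 0) {
                min = k; // first value that occurs at all
            }
            max = k;     // last value seen so far that occurs
            if (mode < 0 || count[k] > count[mode]) {
                mode = k;
            }
            total += count[k];
            sum += (long) k * count[k];
        }
        if (total == 0) {
            throw new IllegalArgumentException("sample must not be empty");
        }
        int lower = valueAtRank(count, (total - 1) / 2);
        int upper = valueAtRank(count, total / 2);
        double median = (lower + upper) / 2.0;
        return new double[] {min, max, (double) sum / total, median, mode};
    }
}
```

The design trade-off is small here: count has only 256 entries, so either a TreeMap or repeated array scans runs in effectively constant time.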
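Implementation
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

class Solution {
    /**
     * 1093. Statistics from a Large Sample
     *
     * @param count count[k] is the number of samples of integer k
     * @return {minimum, maximum, mean, median, mode}
     */
    public double[] sampleStats(int[] count) {
        // TreeMap keeps each number and its occurrence count, ordered by number
        TreeMap<Integer, Integer> countMap = new TreeMap<>();
        double[] result = new double[5];
        // Sum of all samples
        double sum = 0;
        // Total number of samples
        long total = 0;
        // Largest occurrence count seen so far
        long maxTimes = 0;
        // Minimum
        double min;
        // Maximum
        double max;
        // Mean
        double average;
        // Median
        double middle = 0;
        // Mode: the number that occurs most often
        double mode = 0;
        for (int i = 0; i < count.length; i++) {
            if (count[i] != 0) {
                countMap.put(i, count[i]);
                // Cast before multiplying: i * count[i] can overflow int
                sum = sum + (double) i * count[i];
                total += count[i];
                if (count[i] > maxTimes) {
                    maxTimes = count[i];
                    mode = i;
                }
            }
        }
        min = countMap.firstKey().doubleValue();
        max = countMap.lastKey().doubleValue();
        average = sum / total;
        // Whether the number of samples is odd
        boolean odd = total % 2 != 0;
        // 0-based index of the (lower) middle element
        int middleIndex = (int) ((total - 1) / 2);
        // 0-based index of the last sample covered so far
        int index = -1;
        Iterator<Map.Entry<Integer, Integer>> it = countMap.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Integer> entry = it.next();
            int num = entry.getKey();
            int times = entry.getValue();
            index += times;
            if (index > middleIndex) {
                // Both middle positions fall inside the run of num
                middle = num;
                break;
            } else if (index == middleIndex) {
                if (odd) {
                    middle = num;
                } else {
                    // The upper middle element is the next distinct number
                    middle = (num + it.next().getKey()) / 2.0;
                }
                break;
            }
        }
        result[0] = min;
        result[1] = max;
        result[2] = average;
        result[3] = middle;
        result[4] = mode;
        return result;
    }
}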
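One detail worth keeping in mind when accumulating the sum: since sum(count) may reach 10^9 and values go up to 255, a single product of a value and its count can reach about 2.55 × 10^11, which does not fit in a 32-bit int. The sketch below contrasts the two ways of forming that product; the class and method names are hypothetical and exist only for this illustration.

```java
final class WeightedSumOverflowSketch {
    /** Product formed in int arithmetic; it silently wraps around when it overflows. */
    static long productAsInt(int value, int times) {
        return value * times;
    }

    /** Product formed in long arithmetic; exact for values up to 255 and counts up to 10^9. */
    static long productAsLong(int value, int times) {
        return (long) value * times;
    }

    public static void main(String[] args) {
        int value = 255;
        int times = 1_000_000_000;
        // The int version wraps around and does not equal the true product.
        System.out.println(productAsInt(value, times));
        // The long version yields the true product, 255 * 10^9.
        System.out.println(productAsLong(value, times));
    }
}
```

Widening one operand to long or double before the multiplication is enough; the accumulated total itself also stays well inside the range that a double represents exactly.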
When reprinting, please cite the address of this article: http://specialneedsforspecialkids.com/yun/77875.html