{"id":1319,"date":"2014-01-03T20:32:41","date_gmt":"2014-01-03T11:32:41","guid":{"rendered":"http:\/\/kazu.tv\/blog\/?p=1319"},"modified":"2014-01-05T12:58:26","modified_gmt":"2014-01-05T03:58:26","slug":"mahout-0-8-clustering","status":"publish","type":"post","link":"https:\/\/kazu.tv\/blog\/2014\/01\/03\/mahout-0-8-clustering\/","title":{"rendered":"Clustering with Mahout 0.8"},"content":{"rendered":"<p>The other day I wrote a short entry about Mahout, \"<a title=\"Building a Recommendation Engine with Mahout 0.8\" href=\"http:\/\/kazu.tv\/blog\/2013\/12\/27\/mahout-0-8-recommender\/\">Building a Recommendation Engine with Mahout 0.8<\/a>\"; this time the topic is clustering.<\/p>\n<p>As before, I'm writing on the assumption that you already know the basics of Mahout. Mahout in 
Action is a good introduction, I think (see the end of this page).<\/p>\n<p><!--more--><\/p>\n<h2>I want to show \"Customers also bought\"<\/h2>\n<p>The use case: when a user views an item, show something like \"People who bought this item also bought these.\"<\/p>\n<h3>ItemSimilarity?<\/h3>\n<p>For the recommendation engine last time, I built the recommendations roughly as follows.<\/p>\n<ol>\n<li>Load the user &#8211; item &#8211; preference data into a DataModel<\/li>\n<li>Compute an ItemSimilarity or a UserSimilarity<\/li>\n<li>The Recommender computes recommendations based on the similarity<\/li>\n<\/ol>\n<p>To implement \"Customers also bought\", my first idea was to reuse the ItemSimilarity built in step 2 of the recommendation engine as is. I figured I could just display items in descending order of the similarity values computed in step 2 and be done.<\/p>\n<p>To cut to the chase, the similarity computed from our source data did not produce usable results. The biggest problem was that, for a popular item, it simply showed other popular items. If that is all the data gives you, a plain \"sales ranking\" or \"page view ranking\" does the job, and there is no point in building a separate \"Customers also bought\" feature.<\/p>\n<h2>Clustering<\/h2>\n<h3>The raw data we have<\/h3>\n<p>Next I decided to try clustering. Assume we have the following data (it differs from our actual production data).<\/p>\n<ol>\n<li>Users' ratings of items (the preferences used for recommendation)<\/li>\n<li>Free-form tags that users can attach to items (for \"Princess Mononoke\": \"anime\", \"Ghibli\", \"Hayao Miyazaki\", and so on)<\/li>\n<li>Item attributes (genre, price, etc.)<\/li>\n<\/ol>\n<h3>The clustering flow<\/h3>\n<p>I decided to build a vector for each item from the information in 2 and 3. To keep things simple, from here on I'll assume we use genre data (created by the site operators) and tag data (created freely by users).<\/p>\n<p>For example, suppose the system has 3 genres (anime, horror, other) and users have created 5 distinct tags so far (anime, Hayao Miyazaki, splatter, date movie, classic). Each item then gets a vector with 3 + 5 = 8 dimensions. Suppose we have the following data for \"Princess Mononoke\".<\/p>\n<ul>\n<li>Its genre is \"anime\"<\/li>\n<li>5, 4, and 3 people tagged it \"anime\", \"Hayao Miyazaki\", and \"classic\", respectively<\/li>\n<\/ul>\n<p>In this case, the position of Princess Mononoke in the 8-dimensional space is (1, 0, 0, 5, 4, 0, 0, 3).<\/p>\n<p>Generate a vector for every movie in the same way, then feed them to a clustering algorithm, and that's it.<\/p>\n<p>Some readers may wonder whether it is OK that a 1 for a genre and a 1 for a tag mean different things (the former is 1 in the sense of boolean true, the latter is a head count, i.e. a quantity). That is a valid point, but Mahout in Action covers that too, so please read the book.<\/p>\n<h2>The program<\/h2>\n<p>In theory this is simple, but Mahout in 
Action's clustering sample programs were hard to follow, and I struggled quite a bit.<\/p>\n<h3>Creating the vectors<\/h3>\n<p>There are three vector implementations.<\/p>\n<ul>\n<li>DenseVector<\/li>\n<li>RandomAccessSparseVector<\/li>\n<li>SequentialAccessSparseVector<\/li>\n<\/ul>\n<p>As the names suggest, the first is a dense vector; use it when all (or nearly all) dimensions hold a value. The other two are sparse vectors, which is what we want here. Between the two sparse implementations you choose random access or sequential access; sequential access is fine for this case.<\/p>\n<p>So here I use SequentialAccessSparseVector and create the vector as follows. The code samples below are in Scala, but they should be readable even if you only know Java.<\/p>\n<pre class=\"brush: scala\">val features = new SequentialAccessSparseVector(8)<\/pre>\n<p>Then set the values on the vector.<\/p>\n<pre class=\"brush: scala\">features.set(0, 1.0) \/\/ 
\"anime\" genre\r\nfeatures.set(3, 5.0) \/\/ \"anime\" tag\r\nfeatures.set(4, 4.0) \/\/ \"Hayao Miyazaki\" tag\r\nfeatures.set(7, 3.0) \/\/ \"classic\" tag<\/pre>\n<p>That is the data for \"Princess Mononoke\". Store it in a NamedVector.<\/p>\n<pre class=\"brush: scala;\">val namedVector = new NamedVector(features, \"Princess Mononoke\")<\/pre>\n<p>Do this for every item and collect the results in a java.util.List or similar.<\/p>\n<pre class=\"brush: scala;\">val allItems = new ListBuffer[NamedVector]() \/\/ container for the vectors of all items\r\nallItems += namedVector \/\/ store the Princess Mononoke data<\/pre>\n<h3>Writing to a file<\/h3>\n<p>Back in Mahout 0.6 you could also use KMeansClusterer, which clusters in memory, but that class was removed in Mahout 0.7, so you have to use MapReduce. That said, you don't need to build a Hadoop cluster; pseudo-distributed mode is fine.<\/p>\n<p>Since we use MapReduce, the vectors have to be written out to a file.<\/p>\n<pre class=\"brush: scala;\">import org.apache.hadoop.conf.Configuration\r\nimport 
org.apache.hadoop.fs.FileSystem\r\nimport org.apache.hadoop.fs.Path\r\nimport org.apache.hadoop.io.SequenceFile\r\nimport org.apache.hadoop.io.Text\r\nimport org.apache.mahout.math.VectorWritable\r\n\r\n\/\/ Setup\r\nval conf = new Configuration\r\nval fs = FileSystem.get(conf)\r\nval outputDir = \"outdir\" \/\/ output directory\r\nval vectorsFolder = new Path(outputDir, \"vectors\") \/\/ where the vectors are written\r\nval centroidsFolder = new Path(outputDir, \"centroids\") \/\/ where the initial centroids are written\r\nval clustersFolder = new Path(outputDir, \"clusters\") \/\/ where the resulting clusters are written\r\n\r\nval writer = new SequenceFile.Writer(fs, conf, vectorsFolder, classOf[Text], classOf[VectorWritable])\r\nval vectorWritable = new VectorWritable()\r\n\r\n\/\/ Write out each vector, keyed by item name\r\nfor (vector &lt;- allItems) {\r\n  vectorWritable.set(vector)\r\n  writer.append(new Text(vector.getName()), vectorWritable)\r\n}\r\nwriter.close() \/\/ flush and close the file before clustering reads it<\/pre>\n<h3>Running the clustering<\/h3>\n<p>After that, pick the initial centroids at random and run k-means clustering.<\/p>\n<pre class=\"brush: scala;\">import org.apache.mahout.clustering.kmeans.KMeansDriver\r\nimport org.apache.mahout.clustering.kmeans.RandomSeedGenerator\r\nimport org.apache.mahout.common.distance.CosineDistanceMeasure\r\n\r\nval numOfIterations = 50 \/\/ picked arbitrarily\r\nval k = 10 \/\/ number of clusters\r\nRandomSeedGenerator.buildRandom(conf, vectorsFolder, centroidsFolder, k,\r\n    new CosineDistanceMeasure()) \/\/ it may be worth trying other distance measures\r\n\r\n\/\/ Run the clustering. The results are written to files.\r\n\/\/ See the javadoc for the arguments\r\nKMeansDriver.run(vectorsFolder, centroidsFolder, clustersFolder,\r\n    new CosineDistanceMeasure(), 0.01, numOfIterations, true, 0, 
true)<\/pre>\n<h3>Reading the results<\/h3>\n<p>The results are written to a file, so read them back in.<\/p>\n<pre class=\"brush: scala;\">import org.apache.hadoop.io.IntWritable\r\nimport org.apache.mahout.clustering.Cluster\r\nimport org.apache.mahout.clustering.classify.WeightedVectorWritable\r\n\r\n\/\/ Cutting corners: the file name is hard-coded...\r\nval reader = new SequenceFile.Reader(fs,\r\n    new Path(clustersFolder, Cluster.CLUSTERED_POINTS_DIR + \"\/part-m-0\"), conf)\r\nval key = new IntWritable() \/\/ cluster ID; the clustered points are keyed by IntWritable, not Text\r\nval value = new WeightedVectorWritable()\r\n\r\nwhile (reader.next(key, value)) {\r\n  val namedVector: NamedVector = value.getVector().asInstanceOf[NamedVector]\r\n  println(key + \"\\t\" + namedVector.getDelegate())\r\n}\r\nreader.close()<\/pre>\n<h2>Summary<\/h2>\n<p>K-means clustering is simple in theory, but working out how to actually implement it with Mahout is surprisingly confusing.<\/p>\n<p>There are three vector implementations, so pick the appropriate one.<\/p>\n<p>From Mahout 0.7 onward, in-memory clustering is no longer available, so use Hadoop's pseudo-distributed mode. Concretely, write the vectors out to a file and then run KMeansDriver.<\/p>\n<p>As with recommendation, clustering requires tuning the parameters while checking the results.<br \/>\n<iframe loading=\"lazy\" style=\"width: 120px; height: 240px;\" src=\"http:\/\/rcm-fe.amazon-adsystem.com\/e\/cm?lt1=_blank&amp;bc1=000000&amp;IS2=1&amp;bg1=FFFFFF&amp;fc1=000000&amp;lc1=0000FF&amp;t=kblog-22&amp;o=9&amp;p=8&amp;l=as4&amp;m=amazon&amp;f=ifr&amp;ref=ss_til&amp;asins=4873115841\" height=\"240\" width=\"320\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\"><\/iframe><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The other day I wrote a short entry about Mahout, \"Building a Recommendation Engine with Mahout 0.8\"; this time the topic is clustering. 
As before, I'm writing on the assumption that you already know the basics of Mahout. Mahou&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[8],"tags":[920],"class_list":["post-1319","post","type-post","status-publish","format-standard","hentry","category-8","tag-mahout"],"_links":{"self":[{"href":"https:\/\/kazu.tv\/blog\/wp-json\/wp\/v2\/posts\/1319","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kazu.tv\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kazu.tv\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kazu.tv\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kazu.tv\/blog\/wp-json\/wp\/v2\/comments?post=1319"}],"version-history":[{"count":6,"href":"https:\/\/kazu.tv\/blog\/wp-json\/wp\/v2\/posts\/1319\/revisions"}],"predecessor-version":[{"id":1343,"href":"https:\/\/kazu.tv\/blog\/wp-json\/wp\/v2\/posts\/1319\/revisions\/1343"}],"wp:attachment":[{"href":"https:\/\/kazu.tv\/blog\/wp-json\/wp\/v2\/media?parent=1319"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kazu.tv\/blog\/wp-json\/wp\/v2\/categories?post=1319"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kazu.tv\/blog\/wp-json\/wp\/v2\/tags?post=1319"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}