How to get around the happybase "TApplicationException: Internal error processing mutateRows" error?

Time: 2022-10-14 17:25:25

I'm using happybase to connect to my HBase database. I made a sample table called 'irisSample'. Here's the part of the code that I'm having trouble with:

import happybase
from happybase import *
import json

connection = happybase.Connection('<ip-address>', '9090')

table = connection.table('irisSample')

n = 0
x = 1
for u in y:
    data = {'petalWidth':points['petalWidth'][n], 'sepalLength':points['sepalLength'][n], 
          'petalLength':points['petalLength'][n], 'label': u}
    row = 'row' + str(x)
    table.put(row, {'flowers': str(data)})
    n += 1
    x += 1

And I get the following error:

TApplicationException                     Traceback (most recent call last)
<ipython-input-15-94741a8b04dc> in <module>()
      9           'petalLength':points['petalLength'][n], 'label': u}
     10     row = 'row' + str(x)
---> 11     table.put(row, {'flowers': str(data)})
     12     n += 1
     13     x += 1

/root/anaconda/lib/python2.7/site-packages/happybase/table.pyc in put(self, row, data, timestamp, wal)
    437         """
    438         with self.batch(timestamp=timestamp, wal=wal) as batch:
--> 439             batch.put(row, data)
    440 
    441     def delete(self, row, columns=None, timestamp=None, wal=True):

/root/anaconda/lib/python2.7/site-packages/happybase/batch.pyc in __exit__(self, exc_type, exc_value, traceback)
    130             return
    131 
--> 132         self.send()

/root/anaconda/lib/python2.7/site-packages/happybase/batch.pyc in send(self)
     53                      self._table.name, self._mutation_count, len(bms))
     54         if self._timestamp is None:
---> 55             self._table.connection.client.mutateRows(self._table.name, bms, {})
     56         else:
     57             self._table.connection.client.mutateRowsTs(

/root/anaconda/lib/python2.7/site-packages/happybase/hbase/Hbase.pyc in mutateRows(self, tableName, rowBatches, attributes)
   1574     """
   1575     self.send_mutateRows(tableName, rowBatches, attributes)
-> 1576     self.recv_mutateRows()
   1577 
   1578   def send_mutateRows(self, tableName, rowBatches, attributes):

/root/anaconda/lib/python2.7/site-packages/happybase/hbase/Hbase.pyc in recv_mutateRows(self)
   1592       x.read(self._iprot)
   1593       self._iprot.readMessageEnd()
-> 1594       raise x
   1595     result = mutateRows_result()
   1596     result.read(self._iprot)

TApplicationException: Internal error processing mutateRows

I've also tried json.dumps(data) instead of str(data), which threw the same exception.

From what I can gather, this looks more like a Thrift problem, but I could be wrong. I may have to look at starbase instead. I'm not sure, which is why I'm asking here.

2 Answers

#1


1  

It looks like the keys in your "data" object give only the column name, without the column family.

data = {'petalWidth':points['petalWidth'][n], 'sepalLength':points['sepalLength'][n], 'petalLength':points['petalLength'][n], 'label': u}

Each column name needs to be prefixed with its column family. For example, if your column family is "cf", then instead of 'petalWidth' you want 'cf:petalWidth'.

data = {'cf:petalWidth':points['petalWidth'][n], 'cf:sepalLength':points['sepalLength'][n], 'cf:petalLength':points['petalLength'][n], 'cf:label': u}

Doing this fixed the mutateRows error for me.
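
As a rough sketch of the prefixing described above, the helper below builds a happybase-style column dict from plain measurement names. The helper names (qualify_columns, unqualified_columns), the sample values, and the choice of 'flowers' as the column family are assumptions for illustration; use whatever family your table was actually created with. Values are converted to strings because HBase stores raw bytes.

```python
def qualify_columns(family, values):
    """Return a dict keyed by 'family:qualifier' with string values."""
    return {'{}:{}'.format(family, name): str(value)
            for name, value in values.items()}


def unqualified_columns(columns):
    """Return the column names that lack a 'family:' prefix."""
    return sorted(name for name in columns if ':' not in name)


# Hypothetical sample row; 'flowers' is assumed to be the table's column family.
measurements = {'petalWidth': 0.2, 'sepalLength': 5.1,
                'petalLength': 1.4, 'label': 0}
columns = qualify_columns('flowers', measurements)
print(columns)

# The dict from the question has a bare 'flowers' key, which this check flags.
print(unqualified_columns({'flowers': str(measurements)}))

# With a live connection, the write would then look like:
# table.put('row1', columns)
```

Running the check before calling table.put makes it easier to spot a missing family prefix locally, instead of decoding a generic Thrift error from the server.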

#2


-1  

I found the answer here. It required the HBase REST server to be running on an unused port. I then connected to that port and was able to see the list of all the tables in HBase and insert data into the database. The command is ./hbase-daemon.sh start rest -p <unused port number>. I couldn't find the answer anywhere, and no one commented for a few days, so I hope this helps you! Here's the code. I ended up using starbase.

import starbase
from starbase import Connection

c = Connection(host='<ip-address>', port=8000)

8000 is the default port by the way.

t = c.table('irisSample')

n = 0  # row counter used to build the row keys
for flower in range(len(labels)):
    # One qualifier per measurement, all stored under the 'flowers' column family
    data = {'petalWidth':X[flower][3], 'petalLength':X[flower][2],
            'sepalLength':X[flower][0], 'sepalWidth':X[flower][1], 'cluster': labels[flower]}
    n += 1
    row = 'row' + str(n)
    t.insert(row, {'flowers': data})
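
Before pointing a client at the REST server, it can help to confirm that the server answers on the chosen port. The sketch below is one way to do that with the standard library only. It assumes the REST server returns the table list at the root path as JSON shaped like {"table": [{"name": ...}]}; treat that shape as an assumption and check it against your HBase version. The function names rest_url, table_names and list_tables are hypothetical helpers, not part of starbase or HBase.

```python
import json

try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2


def rest_url(host, port, path='/'):
    """Build the URL of a resource on the REST server."""
    return 'http://{}:{}{}'.format(host, port, path)


def table_names(payload):
    """Extract table names from a JSON body (assumed shape, see above)."""
    return [entry['name'] for entry in json.loads(payload).get('table', [])]


def list_tables(host, port, timeout=5):
    """Ask the REST server for its tables; raises if the server is unreachable."""
    request = Request(rest_url(host, port),
                      headers={'Accept': 'application/json'})
    response = urlopen(request, timeout=timeout)
    try:
        return table_names(response.read().decode('utf-8'))
    finally:
        response.close()


# Offline demonstration with a hand-written payload in the assumed shape.
sample = '{"table": [{"name": "irisSample"}]}'
print(table_names(sample))

# Against a running server (not executed here):
# print(list_tables('<ip-address>', 8000))
```

If list_tables raises a connection error, the REST server is probably not running on that port, which is worth ruling out before debugging the insert code.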
