Albirew/nyaa-pantsu
This repository was archived on 2022-05-07. You can view its files or clone it, but you cannot open issues or pull requests, or push changes.
nyaa-pantsu/deploy/ansible/roles/elasticsearch/files/index_nyaapantsu.py
tomleb f22d11b35d Elasticsearch integration (WIP) (#730)
* Update mapping to be similar to TorrentJSON

* Implement ES search for TorrentParam

* Add seeders/leechers/completed to es index

* Fix filter, use analyzer

* Use ES for the search route

* Add upload_id filtering with ES

* Create/update ES index on torrent upload/update

* Delete from ES index on Delete

* Use ES everywhere, fallback to postgres query

Use Elasticsearch to search the index whenever a call to searchByQuery
is made. Big cleanup needed, but _it werks_.

* Only fetch ids from ES, nothing else

* Use ColumnUpdate instead of Save

* Add FIXME/info to search

* Template needs []TorrentJSON not []Torrent
2017-05-26 09:48:14 +10:00

61 lines
1.9 KiB
Python

# coding: utf-8
from elasticsearch import Elasticsearch, helpers
import os
import sys
import psycopg2
CHUNK_SIZE = 10000
dbparams = ''
pantsu_index = ''
# Both settings are required; exit early with a clear message if one is missing.
try:
    dbparams = os.environ['PANTSU_DBPARAMS']
except KeyError:
    print('[Error]: Environment variable PANTSU_DBPARAMS not defined.')
    sys.exit(1)
try:
    pantsu_index = os.environ['PANTSU_ELASTICSEARCH_INDEX']
except KeyError:
    print('[Error]: Environment variable PANTSU_ELASTICSEARCH_INDEX not defined.')
    sys.exit(1)
es = Elasticsearch()
pgconn = psycopg2.connect(dbparams)
cur = pgconn.cursor()
cur.execute("""SELECT torrent_id, torrent_name, category, sub_category, status,
                      torrent_hash, date, uploader, downloads, filesize
               FROM torrents
               WHERE deleted_at IS NULL""")
# Read the result set in chunks so the whole table is never held in memory at once.
fetches = cur.fetchmany(CHUNK_SIZE)
while fetches:
    actions = list()
    for (torrent_id, torrent_name, category, sub_category, status,
         torrent_hash, date, uploader, downloads, filesize) in fetches:
        # TODO Add seeds/leech
        # TODO Consistent ID representation on the codebase
        # Decode only when the driver returned bytes (Python 2); on Python 3
        # the name is already text and has no .decode() method.
        if isinstance(torrent_name, bytes):
            torrent_name = torrent_name.decode('utf-8')
        doc = {
            'id': str(torrent_id),
            'name': torrent_name,
            'category': str(category),
            'sub_category': str(sub_category),
            'status': status,
            'hash': torrent_hash,
            'date': date,
            'uploader_id': uploader,
            'downloads': downloads,
            'filesize': filesize,
            'seeders': 0,   # TODO Get seeders from database
            'leechers': 0,  # TODO Get leechers from database
            'completed': 0  # TODO Get completed from database
        }
        action = {
            '_index': pantsu_index,
            '_type': 'torrents',
            '_id': torrent_id,
            '_source': doc
        }
        actions.append(action)
    # Send one bulk request per chunk of rows.
    helpers.bulk(es, actions, chunk_size=CHUNK_SIZE, request_timeout=120)
    # Release the current chunk before fetching the next one.
    del fetches
    fetches = cur.fetchmany(CHUNK_SIZE)
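
The fetch-and-index loop above can be hard to test because it needs a live PostgreSQL and Elasticsearch. The following is a minimal sketch, not part of the repository, of how the chunking and the action building could be factored into plain functions. The names `iter_chunks`, `build_action` and `FakeCursor` are hypothetical, and the row shape is reduced to two columns for brevity; the only cursor method it relies on is the standard DB-API `fetchmany`.

```python
def iter_chunks(cursor, chunk_size):
    """Yield lists of rows from a DB-API cursor until it is exhausted."""
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            return
        yield rows


def build_action(row, index_name):
    """Turn one (torrent_id, torrent_name) row into a bulk-index action dict."""
    torrent_id, torrent_name = row
    return {
        '_index': index_name,
        '_id': torrent_id,
        '_source': {'id': str(torrent_id), 'name': torrent_name},
    }


class FakeCursor(object):
    """Hypothetical stand-in for a database cursor, used only for this demo."""

    def __init__(self, rows):
        self._rows = list(rows)

    def fetchmany(self, size):
        # Hand out the next `size` rows and keep the remainder for later calls.
        chunk, self._rows = self._rows[:size], self._rows[size:]
        return chunk


cursor = FakeCursor([(1, 'a'), (2, 'b'), (3, 'c')])
batches = [[build_action(row, 'demo_index') for row in rows]
           for rows in iter_chunks(cursor, 2)]
```

In a real run, each element of `batches` would presumably be passed to `helpers.bulk(es, batch, ...)` in place of the `actions` list, and a psycopg2 cursor would replace `FakeCursor`.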