Albirew/nyaa-pantsu
Archived

9 commits

Author SHA1 Message Date
tomleb
212027c6a6 Use minimum 1 ngram (ie: for Gantz O) 2017-05-31 21:20:58 -04:00
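A minimal sketch of why a minimum n-gram length of 1 matters for a title such as "Gantz O". It assumes a plain n-gram scheme over whitespace-separated, lowercased tokens; the helper names `ngrams` and `index_title` and the `max_gram` default are hypothetical and are not taken from the repository's Elasticsearch settings.

```python
def ngrams(token: str, min_gram: int, max_gram: int) -> list[str]:
    """Return every substring of `token` whose length is in [min_gram, max_gram]."""
    grams = []
    for size in range(min_gram, max_gram + 1):
        # A token shorter than `size` yields an empty range, hence no grams.
        for start in range(0, len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


def index_title(title: str, min_gram: int, max_gram: int = 3) -> set[str]:
    """Lowercase the title, split it on whitespace, and collect each token's n-grams."""
    grams: set[str] = set()
    for token in title.lower().split():
        grams.update(ngrams(token, min_gram, max_gram))
    return grams


if __name__ == "__main__":
    # With min_gram=2 the one-character token "o" produces no grams at all.
    print("o" in index_title("Gantz O", min_gram=2))
    # With min_gram=1 the token "o" is indexed and can be matched.
    print("o" in index_title("Gantz O", min_gram=1))
```

Under these assumptions, a one-character token is silently dropped from the index whenever the minimum gram length exceeds 1, which is consistent with the problem the commit title describes.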
tomleb
ba683c3bcb Improve search and fix '*' in search box (#871)
* Improve ES search

The new performance is very good.
Some examples on my 1.5 GB VM:
INFO[0153] Query 'shingeki' took 6 milliseconds.
INFO[0125] Query 'アニメ' took 17 milliseconds.
INFO[0102] Query 'shingeki -kyojin horrible ' took 12 milliseconds

Looking at the criteria we wanted here:
https://pad.riseup.net/p/i8DrilHDWRvf, it meets:

1. Fast: sub-100ms for a typical query, sub-50ms is good and sub-20ms is
optimal
2. Prefix match: "horrible" finds horriblesubs
3. Substring match? "アニメ" finds "TVアニメ"
4. Position-independent terms ("shingeki kyojin" finds the same as
"kyojin shingeki")
5. Works with short term lengths correctly and fast (no in "kyoukai no
kanata", 04 in "horrible shingeki 04" etc)
7. (nice to have) search negation: shingeki kyojin -horriblesubs

* Use match_all query instead of *, fix *
2017-06-01 08:38:29 +10:00
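The search criteria listed in commit ba683c3bcb (negation with a leading `-`, position-independent terms, and `match_all` instead of `*`) can be sketched as a query builder. This is an illustration only, not the project's implementation: the function name `build_query` and the field name `name` are assumptions, and only the standard Elasticsearch `match_all`, `bool`, and `match` query types are used.

```python
def build_query(text: str, field: str = "name") -> dict:
    """Build an Elasticsearch query body from a search-box string.

    `field` is a hypothetical name for the indexed title field.
    """
    terms = text.split()

    # An empty search box or a lone "*" means "everything": use match_all
    # rather than passing "*" through as a literal search term.
    if not terms or terms == ["*"]:
        return {"query": {"match_all": {}}}

    must, must_not = [], []
    for term in terms:
        if term.startswith("-") and len(term) > 1:
            # "-kyojin" excludes documents matching "kyojin".
            must_not.append({"match": {field: term[1:]}})
        else:
            must.append({"match": {field: term}})

    # Every clause in `must` has to match, regardless of term order.
    return {"query": {"bool": {"must": must, "must_not": must_not}}}


if __name__ == "__main__":
    import json
    print(json.dumps(build_query("shingeki -kyojin horrible "), indent=2))
```

Because each positive term becomes its own `must` clause, "shingeki kyojin" and "kyojin shingeki" select the same documents in this sketch.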
tomleb
360b35a08f Add reindexing every 5 minutes, and a bunch of other things (#852)
* Fix error messages with ES results

* Add lsof for debugging

* Add torrents table variable to index sukebei

* Use elasticsearch alias for hotswapping index

* Increase max open files, increase ES heap size

* Add reindex script and reindex triggers

We use a table to record the actions performed on the torrents table.
When a row in the torrents table is inserted, updated, or deleted, the
trigger fires and adds an entry to the reindex_torrents table.

The reindex_nyaapantsu.py script is then used to query the
reindex_torrents table and apply the correct reindex action to
elasticsearch. The entries are then removed from the reindex_torrents table.

* Reindex every 5 minutes as cronjob
2017-05-30 21:22:12 -05:00
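The reindex flow described in commit 360b35a08f (a queue table filled by triggers, then a script that applies each action to Elasticsearch) might look roughly like the sketch below. It is not the contents of reindex_nyaapantsu.py: the `(torrent_id, action)` row shape, the index name `nyaapantsu`, and the function name `build_bulk_actions` are assumptions. The returned dictionaries are intended to follow the action format accepted by the bulk helper in the Elasticsearch Python client, but no client is imported here so the example runs on its own.

```python
def build_bulk_actions(queue_rows, torrents, index="nyaapantsu"):
    """Turn queued changes into bulk index/delete actions.

    queue_rows: iterable of (torrent_id, action) in the order recorded,
                where action is "INSERT", "UPDATE", or "DELETE" (assumed shape).
    torrents:   dict mapping torrent_id to the current document to index.
    """
    # Keep only the most recent action per torrent; earlier ones are superseded.
    latest = {}
    for torrent_id, action in queue_rows:
        latest[torrent_id] = action

    actions = []
    for torrent_id, action in latest.items():
        if action == "DELETE" or torrent_id not in torrents:
            # The row is gone from the torrents table, so drop it from the index.
            actions.append({"_op_type": "delete", "_index": index, "_id": torrent_id})
        else:
            # INSERT and UPDATE both re-index the current row.
            actions.append({
                "_op_type": "index",
                "_index": index,
                "_id": torrent_id,
                "_source": torrents[torrent_id],
            })
    return actions


if __name__ == "__main__":
    rows = [(1, "INSERT"), (2, "INSERT"), (1, "UPDATE"), (2, "DELETE")]
    current = {1: {"name": "example torrent"}}
    for action in build_bulk_actions(rows, current):
        print(action)
```

After the actions are applied, the processed rows would be deleted from the queue table, as the commit message describes. Collapsing repeated entries per torrent is a design choice of this sketch, not something the commit message states.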
Eliot Whalan
5bcda5c9a1
Revert "Add playbook to download /build latest nyaa, fix k-on elasticsearch issue (#821)"
This reverts commit da1e323825.
2017-05-29 08:24:04 +10:00
tomleb
da1e323825 Add playbook to download /build latest nyaa, fix k-on elasticsearch issue (#821)
* Install nyaa from latest github commit

* Add install playbook, fix k-on search
2017-05-29 07:54:24 +10:00
tomleb
d6c50f5640 TorrentJSON.ID is uint now, fix weird page sorting (#769)
* TorrentJSON.ID is uint now, fix weird page sorting

The bug was that ES would sort by ID in a weird manner because the id
was a string. The id is now a uint.

* Resolved the conflict for future merging
2017-05-27 11:54:41 +10:00
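The sorting bug fixed in commit d6c50f5640 can be reproduced outside Elasticsearch: string IDs are compared character by character, so they sort lexicographically rather than numerically. The sample IDs below are made up for illustration.

```python
# IDs stored as strings, as they were before the fix (sample values).
ids_as_strings = ["9", "10", "100", "2"]

# String comparison is lexicographic: "10" sorts before "2" because "1" < "2".
sorted_as_strings = sorted(ids_as_strings)

# Converting to integers restores the expected numeric order.
sorted_as_numbers = sorted(int(i) for i in ids_as_strings)

if __name__ == "__main__":
    print(sorted_as_strings)   # ['10', '100', '2', '9']
    print(sorted_as_numbers)   # [2, 9, 10, 100]
```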
tomleb
f22d11b35d Elasticsearch integration (WIP) (#730)
* Update mapping to be similar to TorrentJSON

* Implement ES search for TorrentParam

* Add seeders/leechers/completed to es index

* Fix filter, use analyzer

* Use ES for the search route

* Add upload_id filtering with ES

* Create/update ES index on torrent upload/update

* Delete from ES index on Delete

* Use ES everywhere, fallback to postgres query

Use Elasticsearch to search the index whenever a call to searchByQuery
is made. Big cleanup needed, but _it werks_.

* Only fetch ids from ES, nothing else

* Use ColumnUpdate instead of Save

* Add FIXME/info to search

* Template needs []TorrentJSON not []Torrent
2017-05-26 09:48:14 +10:00
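Two items from commit f22d11b35d, "Use ES everywhere, fallback to postgres query" and "Only fetch ids from ES, nothing else", describe a pattern that can be sketched as follows. The project itself is written in Go; this Python version only illustrates the idea, and `search_ids`, `load_in_order`, and the two search callables are hypothetical names.

```python
from typing import Callable, Iterable


def search_ids(query: str,
               search_es: Callable[[str], Iterable[int]],
               search_db: Callable[[str], Iterable[int]]) -> list[int]:
    """Ask the search index for matching IDs; fall back to the database on failure."""
    try:
        return list(search_es(query))
    except Exception:
        # Broad catch for brevity; real code would catch the client's
        # specific connection and query errors.
        return list(search_db(query))


def load_in_order(ids: list[int], rows_by_id: dict[int, dict]) -> list[dict]:
    """Fetch full rows for the IDs, keeping the ranking order the search returned."""
    return [rows_by_id[i] for i in ids if i in rows_by_id]


if __name__ == "__main__":
    def broken_es(query: str) -> list[int]:
        raise ConnectionError("search index unavailable")

    def db_search(query: str) -> list[int]:
        return [3, 1]

    rows = {1: {"id": 1, "name": "first"}, 3: {"id": 3, "name": "third"}}
    ids = search_ids("shingeki", broken_es, db_search)
    print(load_in_order(ids, rows))
```

Returning only IDs from the index keeps the database as the single source of truth for row contents, at the cost of one extra lookup per search.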
tomleb
94af9997e0 Fix typo in elasticsearch_settings.yml file 2017-05-17 19:05:58 -04:00
tomleb
4609176785 Create main directory and add elasticsearch configuration 2017-05-17 19:05:58 -04:00