1. Advertising
    y u no do it?

    Advertising (learn more)

    Advertise virtually anything here, with CPM banner ads, CPM email ads and CPC contextual links. You can target relevant areas of the site and show ads based on geographical location of the user if you wish.

    Starts at just $1 per CPM or $0.10 per CPC.

Combined DB technologies?

Discussion in 'Databases' started by datomtom, Aug 25, 2019.

  1. #1
    Dear forum – yet another DB decision question. Thanks in advance for any good advice.

    So, I am setting up a database for the "media archive" of our company. It includes metadata about (1) print data (such as press clippings, book reviews, marketing documents, books and brochures we publish) and (2) audio-visual data (basically everything with a timeline, such as digitized videos and audio files, DVDs and CDs). So these two build the main data tables and there's a couple of other smaller tables with dependencies. We've developed strategies for unique signatures and flexibility to scale up the system later on in terms of data types. At the moment I am going the classic php / mysql (maria db) stack way for doing this. But it's at an early stage, where we basically store and edit the data in a spreadsheet driven system for the moment for importing into it via CVS later on. The final system will be hosted on a Linux root server at an external provider that I set up on my own – so I am open to install whatever technology I want. It's important for me to stick with open source solutions.

    Now, especially for (1) above, I am looking at a way to make even the the full text of our published books searchable. Let's say we have 100 books, with every book in the range of 120,000 words. So we are talking about a total of 12,000,000 words at the moment. (In fact a lot more for the full text search, since the books are just one part among many other items in the database.) From what I found out so far, it would be a bad idea performance-wise to store the full text of the books in a RDBMS generally, and in a MEDIUMTEXT or LONGTEXT field of a MYSQL database in particular and then search within it. So here are the upcoming questions:

    (a) What DB system would be a good option for this kind of task in general?

    (b) Would it make sense to "combine" DB systems, such as keeping all the metadata in MYSQL and just "connect" it to another DB system (like a NonSQL / text driven one) that's just for storing the huge full text?

    (c) With all these options I would like to keep it practical for the queries - in the best case I sticking with PHP statements (using PDO) for the queries. And of course get some good performance for the search results. This is what it's all about.

    Again, thanks for sticking with my request until here and any advice.
     
    datomtom, Aug 25, 2019 IP
  2. deathshadow

    deathshadow Acclaimed Member

    Messages:
    9,732
    Likes Received:
    1,998
    Best Answers:
    253
    Trophy Points:
    515
    #2
    In a case like that I'd suggest the assistance of a fulltext search that works with your existing DB. An actual "search engine".

    It's been years, but last time I had something like this to deal with, I used Sphinx.

    http://sphinxsearch.com/

    It can make fulltext indexes from your existing databases, allowing for parallel operation where it handles the search, but spits out the related ID's for your DB tables.
     
    deathshadow, Sep 14, 2019 IP
  3. datomtom

    datomtom Greenhorn

    Messages:
    2
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    11
    #3
    Awesome. Thank you @deathshadow ... this is very helpful indeed. I will give this a try.
     
    datomtom, Sep 16, 2019 IP