1. Advertising
    y u no do it?

    Advertising (learn more)

    Advertise virtually anything here, with CPM banner ads, CPM email ads and CPC contextual links. You can target relevant areas of the site and show ads based on geographical location of the user if you wish.

    Starts at just $1 per CPM or $0.10 per CPC.

Removing empty space from database

Discussion in 'Databases' started by JEET, Sep 9, 2021.

  1. #1
    Removing empty space from database

    Let me explain the situation.
    Suppose I create a new table with an auto_increment column, and I make 3 new inserts in it.

    Table: id, name
    1, john
    2, Alex
    3, jeet

    Then I delete the last record and create a new one.
    Now, mysql will remove the entry at record id "3", but it will not add the new record at id "3".
    The new record will be added at record ID "4".
    So table will look like this:

    1, john
    2, Alex
    4, Sam

    I am thinking that if lots of deletes happen in a table, then lots of such empty spaces will get created in the database overtime, which would slow down the "select" queries eventually.
    Is that correct?

    How can I fix this issue?

    I tried exporting the whole table and then rebuilding it.
    But the export dump has the record ID field in it.
    So even after exporting and importing, the table still looks like this:

    1, john
    2, alex
    4, sam

    I am not sure if the null record ID "3" is still present or not...

    optimize table is not working either. It takes lot of time, and PHP time_limit expires or mysql goes away while doing that.

    Any ideas?
    Thanks
     
    Solved! View solution.
    JEET, Sep 9, 2021 IP
  2. sarahk

    sarahk iTamer Staff

    Messages:
    28,500
    Likes Received:
    4,460
    Best Answers:
    123
    Trophy Points:
    665
    #2
    The deleted rows don't exist magically, they're gone forever.

    However databases like MySQL do have an optimise function which gets rid of any traces of deleted rows, tidies up indexes etc.
     
    sarahk, Sep 9, 2021 IP
    JEET likes this.
  3. mmerlinn

    mmerlinn Prominent Member

    Messages:
    3,197
    Likes Received:
    818
    Best Answers:
    7
    Trophy Points:
    320
    #3
    I don't know mysql, but most databases have a method to PACK the database and remove records marked for deletion.
     
    mmerlinn, Sep 9, 2021 IP
    JEET likes this.
  4. c1lonewolf

    c1lonewolf Member

    Messages:
    57
    Likes Received:
    21
    Best Answers:
    1
    Trophy Points:
    33
    #4
    I call it row recycling.
    What you're looking to do is replace the deleted information with new information without adjusting the db's index number for that table. So you have to create a default string to go into any entries that are removed such as
    posted, updated, name
    0000.00.00, 0000.00.00, name (default entry settings)
    Now when you go to remove an entry you simply replace the column values with the default settings. When you add a new entry you search for the first entry containing the default values, then replace these with the new ones.

    Note this is only to be used when this is what you "want" to do. For database tables that have large gaps the setup for this method is a lil' different. You must also understand the pitfalls.
     
    c1lonewolf, Sep 19, 2021 IP
    JEET likes this.
  5. JEET

    JEET Notable Member

    Messages:
    3,825
    Likes Received:
    502
    Best Answers:
    19
    Trophy Points:
    265
    #5
    @c1lonewolf
    Yes, this is what I am talking about.
    However, I was hoping there would be a way to force mysql into doing this during insert, instead of me doing "select plus update" instead of "insert"...
    What is the process for database with large gaps, and what are the pitfalls? (other than the double query per insert)

    Multiple insert in one query will not be possible in this way.
    Like:
    insert into table (name) values
    ('john'),('jeet')...etc

    Do those gaps even slow down select queries, or am I just assuming this?
    Suppose there are 10 million records in database table.
    A couple thousand get deleted everyday, and a couple thousand get added.
    Those new records are not replacing old records. They are added new, at the bottom of the table.
    Is this table actually working slower at "select" query, or is it just my assumption?

    Thanks
     
    JEET, Sep 21, 2021 IP
  6. sarahk

    sarahk iTamer Staff

    Messages:
    28,500
    Likes Received:
    4,460
    Best Answers:
    123
    Trophy Points:
    665
    #6
    The speed of your select is determined by the indexes - not the gaps in id numbers
     
    sarahk, Sep 21, 2021 IP
    JEET likes this.
  7. #7
    @JEET
    1. There is no gap in your data table, it's all in your head.

    2. "I was hoping there would be a way to force mysql into doing this during insert" - Jeet
    That would go against the laws of thermal dynamics (or at least datbase coding anyway).
    Insert - means to add
    Update - means to replace
    Delete - means to remove

    These are the basic functions. Attempting to use insert to update entries based on a gap that doesn't even exist would only cause problems.

    And my answer wasn't meant for implementing multiple inserts it was meant to handle one at a time. And if I remember correctly you can use arrays where values are concerned...
    
    insert into table (name) where id="[1,5,7]" valuse="[john,jeet,jester]"
    
    Code (markup):
    or something like that.

    "Suppose there are 10 million records in database table. A couple thousand get deleted everyday, and a couple thousand get added. Those new records are not replacing old records. They are added new, at the bottom of the table." - Jeet
    Which is why most people who have extremely large data tables don't worry about the numbers. When a record is deleted the entire row is removed from that table. When a new record is created it is added (appened) to the table. Incrementing id's only apply to the row itself.
    ---------------------------------------------
    For those that want to get a better understanding...
    I call this the "Gap Delusion" where you're brain thinks there's a problem because your id numbers are not sequential, when in reality there's nothing wrong, because the id numbers only apply to the entry itself not the number of entries in the table.

    1. Gaps in id numbers "do not" slow down processing.
    The auto increment value for a table column uses a counter. The default for this counter starts at "0" unless you change it to something else like "600". The counter increments each time a new entry is created. The counter only keeps track of how many entries have been added since the table was created. And yes, the counter can be reset.

    The id value of the entry is only a link. In a relational database you need a way to connect pieces of information that might be scattered across multiple data tables, you do that by creating an entity that is the same in all. I've used the example below before (without the names if course):

    
    Users:(could be login info)
    User_ID | User_Name
    1 | john
    2 | jeet
    
    User_Info:
    User_ID | User_FirstName | User_LastName
    1 | john | smith
    2 | jeet | unknown
    
    
    User_Channels:
    Chann_ID | Chann_Name | User_ID
    1 | Understanding Datbases | 2
    4 | Experimenting With Code | 2
    
    
    User_Channel_Subscriptions:
    User_ID | Chann_ID | Chann_Name
    1 | 1 | Understanding Datbases
    2 | 1 | Understanding Datbases
    
    
    Code (markup):
    Notice each table deals with something different, but all have the user id in common. In the subscription table the "channel id" simply connects each individual with channels that were created by members. This also allows you to use the "JOIN" method to gather all the information needed on one individual for output.

    2. Query result speeds are based on three factors:
    a) How fast the server posts the request.
    b) The overall number of entries in the table (tables with 10 million records are going to process slower than one with 1 hundred or even just 10.
    c) The amount of time it takes sql to search through all the entries and post relative entries to an array and send it back to you.

    And yes with the record replacement method there are advantages and disadvantages that's why it's considered an option, not a requirement. It's a personal choice.

    Speed
    There are slight speed disadvantages when having to run extra queries. Do you delete the item all together and move on or replace the row's information leaving the id intact? Do you simply insert a new record or look for a record that's information can be replaced. The choice depends on how worried you are about an imaginary gap.

    What To Replace
    Only the important stuff. In the example above, the personal and login information of the former id holder would be replaced with either the default values or the info provided by a new member. Any channels or subscriptions used by the former id holder would be deleted completely allowing the new id holder to make their own selections when and if they choose too.

    Marketing
    When an item is deleted from your data table there's a reason for it. Let's use a product table as an example. You may have deleted an item because it was discontinued, you changed your product line, there are so many reasons why you could've removed the entry, but what about all those links out there across the net that you've never seen.

    When you pull up an item on your website it's usually done using the 'id'. Over the years you've had a lot of people visiting and sending out links to specific items to friends, family etc. and you don't even know how many links are out there just waiting to be clicked. They can be in forums or blogs you name it and when those links get clicked all they're going to get is "Item Not Found!". Now you can easily justify that by saying something like, "well they're on the site and they might look around and buy something." Most won't, but there was something about that link that made them want to click it and because of that you have an opportunity to enhance the user experience!

    So the question becomes, do you delete and speed things up, possibly losing new clients or sales, or do you replace and enhance the visitors experience and possibly gain new clients and increase your sales? There's a lil' more to this, but you'll have to figure it out.

    If your code uses more than one "print" or "echo" statement, chances are your website loads slower than any query ever will.
    C1lonewolf
     
    c1lonewolf, Sep 22, 2021 IP
    JEET and sarahk like this.
  8. JEET

    JEET Notable Member

    Messages:
    3,825
    Likes Received:
    502
    Best Answers:
    19
    Trophy Points:
    265
    #8
    @c1lonewolf
    Thanks for explaining this.
    My only concern was if gaps slow down queries, and since there are no gaps, then its no problem.

    Thankyou @sarahk and @mmerlinn your suggestions also helped me. Thanks
     
    JEET, Sep 24, 2021 IP
  9. laci272

    laci272 Well-Known Member

    Messages:
    163
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    118
    #9
    Maybe you are looking for this?
    OPTIMIZE TABLE `table_name`;
    Code (markup):
    I wouldn't worry much about empty space, though, unless you're table is over 100 GB or so.
     
    laci272, Feb 15, 2023 IP
    JEET likes this.
  10. JEET

    JEET Notable Member

    Messages:
    3,825
    Likes Received:
    502
    Best Answers:
    19
    Trophy Points:
    265
    #10
    @laci272
    Thanks, I used to use optimize, but on large tables its like taking forever to finish. That's why I stopped doing optimize table.
     
    JEET, Feb 16, 2023 IP
  11. laci272

    laci272 Well-Known Member

    Messages:
    163
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    118
    #11
    There is a trick you can do detailed here: https://www.percona.com/blog/mysql-optimize-tables-innodb-stop/
    But alas, I would not worry about spaces in the table, unless your table is really huge and looks like swiss cheese:)
     
    laci272, Feb 16, 2023 IP
    JEET likes this.
  12. JEET

    JEET Notable Member

    Messages:
    3,825
    Likes Received:
    502
    Best Answers:
    19
    Trophy Points:
    265
    #12
    Thanks, I will look into it.
     
    JEET, Feb 16, 2023 IP