
Logs, DBs & Back-ups - Your Strategies?

Discussion in 'Apache' started by T0PS3O, Jan 10, 2006.

  1. #1
    With Urchin requiring almost full-on Apache logging (almost all variables), combined with decent traffic, I can see my access files growing. One of them hit 2.2GB within just 5 months.

    This pushed me over the back-up transfer limit and made me wonder what I should actually be backing up, and how.

    What do you guys & girls do?

    Should I just not back up access logs, or only once a month? They'll end up being extraordinarily large, and in Urchin I'd normally only look at the current or previous month. Is there a way in Apache on RHEL to split the access log in two: one archive with everything older than 3 months (which could be cut and pasted to tape as well), and one to use for Urchin (stats up to 3 months or so) which gets backed up regularly? That way file sizes stay manageable.

    The same goes for (MySQL) databases. Backing up the same old stuff over and over is quite silly really; are there back-up tools that can help split out what actually needs backing up?

    How do you manage such vast volumes of data?
     
    T0PS3O, Jan 10, 2006 IP
  2. dkalweit

    dkalweit Well-Known Member

    Messages:
    520
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    150
    #2
    I have about 10GB+ of data backed up via scp over the big internet every night. My servers run a cron'd script to tar/zip it all up, rotate it with the past days' backups, and upload the current backup to my cable modem at home (5Mbps download, so top speed). It's much larger than I'd like, but I don't want to spend time archiving access logs and the like just yet...
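A setup like that could be sketched roughly as below. This is a guess at the shape of such a script, not the poster's actual one: the paths under /tmp are demo placeholders, and the remote host in the commented scp line is hypothetical.

```shell
#!/bin/sh
# Nightly backup sketch: tar/zip everything, rotate old archives, ship offsite.
SRC=${SRC:-/tmp/backup-demo-src}        # directory tree to back up (demo path)
BACKUP_DIR=${BACKUP_DIR:-/tmp/backup-demo}  # where local archives are kept
STAMP=$(date +%Y%m%d)
mkdir -p "$SRC" "$BACKUP_DIR"

# tar/zip it all up into a dated archive
tar -czf "$BACKUP_DIR/site-$STAMP.tar.gz" -C "$SRC" .

# rotate: keep only the 7 newest archives, delete the rest
ls -1t "$BACKUP_DIR"/site-*.tar.gz | tail -n +8 | xargs -r rm -f

# ship the current archive offsite (hypothetical host, key-based ssh):
# scp "$BACKUP_DIR/site-$STAMP.tar.gz" user@backup.example.com:backups/
echo "wrote $BACKUP_DIR/site-$STAMP.tar.gz"
```

Run from cron (e.g. `0 3 * * *`) this gives the daily tar-rotate-upload cycle described above.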

    As for breaking up access logs, a typical setup uses logrotate to create one log file per day. Then you simply delete the old files once you don't care about them anymore. Urchin also has a way to purge old data, I believe...
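For reference, a daily logrotate setup along those lines might look like the fragment below. The paths are typical RHEL defaults, not necessarily the poster's, and the 90-day retention matches the "stats up to 3 months" idea from the question.

```
# Sketch of /etc/logrotate.d/httpd (paths and counts are assumptions)
/var/log/httpd/access_log {
    daily
    rotate 90          # keep ~3 months of daily files
    compress
    delaycompress      # leave yesterday's log uncompressed for the stats run
    missingok
    notifempty
    postrotate
        /sbin/service httpd reload > /dev/null 2>&1 || true
    endscript
}
```

Anything older than the `rotate` window is deleted automatically, so the archive-to-tape step would happen before that cutoff.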


    --
    Derek
     
    dkalweit, Jan 10, 2006 IP
    T0PS3O likes this.
  3. T0PS3O

    T0PS3O Feel Good PLC

    Messages:
    13,219
    Likes Received:
    777
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Thanks! I wasn't aware of terms like log rotation, so I'll check it out. I'll also check Urchin's documentation. Thanks for the ideas.

    I am contemplating having a dedicated, simple Linux box rsync'ing every day, similar to your setup. But pulling 3GB and growing every day might piss off our ISP.

    Backing up is an artform in its own right it seems.
     
    T0PS3O, Jan 10, 2006 IP
  4. dkalweit

    dkalweit Well-Known Member

    Messages:
    520
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    150
    #4
    My ISP gives me 2000GB/month, and I rarely even hit 500GB, even with these backups (per server). I do use rsync for syncing my access logs: I have one dedicated server that I use for all my Urchin stats, and it rsyncs (via ssh) with the other servers to fetch the changes in the access log files. It works quite well and keeps network traffic to a minimum, so I can do it hourly (my Urchin interval) with no problems... I run Urchin on my Linux dedicated server, and it actually rsyncs with my Windows servers, which are running cygwin's SSH server and shared keys... Works quite nicely...


    --
    Derek
     
    dkalweit, Jan 10, 2006 IP
  5. blinxdk

    blinxdk Peon

    Messages:
    660
    Likes Received:
    27
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Also remember that you can use compression with both rsync and scp; that should drastically lower the bandwidth used (rsync -z and scp -C).
     
    blinxdk, Jan 10, 2006 IP
  6. dkalweit

    dkalweit Well-Known Member

    Messages:
    520
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    150
    #6
    Indeed. I use rsync compression when transferring log files, but scp compression can't compress a compressed tarball much further, and given its size it can sometimes just add needless overhead...
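That point is easy to demonstrate: compressing already-compressed data gains nothing and even grows the file slightly. The snippet below uses random bytes as a stand-in for a gzipped tarball's contents.

```shell
# Double-compression demo: the second gzip pass cannot shrink the first's output.
head -c 100000 /dev/urandom > /tmp/gzip-demo.bin        # incompressible stand-in data
gzip -cf /tmp/gzip-demo.bin > /tmp/gzip-demo.bin.gz     # first compression pass
gzip -cf /tmp/gzip-demo.bin.gz > /tmp/gzip-demo.bin.gz.gz  # "scp -C on a tarball"
wc -c /tmp/gzip-demo.bin /tmp/gzip-demo.bin.gz /tmp/gzip-demo.bin.gz.gz
```

The sizes only go up: each pass adds header and framing overhead without reclaiming anything, which is why -C only pays off on uncompressed streams like raw log files.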


    --
    Derek
     
    dkalweit, Jan 10, 2006 IP
  7. T0PS3O

    T0PS3O Feel Good PLC

    Messages:
    13,219
    Likes Received:
    777
    Best Answers:
    0
    Trophy Points:
    0
    #7
    That's a great idea: just do the whole Urchin analysis on a different machine, which could even be local, I guess. That way the access log never grows bigger than whatever it accumulates during the interval you set to rsync it down.

    The same goes for database data that is only for lookup/analysis, not for re-use/manipulation.

    Food for thought, thanks.

    I guess once you move from virtual hosting to your first dedicated box, the 2nd, 3rd and more aren't far off anymore.
     
    T0PS3O, Jan 10, 2006 IP
  8. dkalweit

    dkalweit Well-Known Member

    Messages:
    520
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    150
    #8
    As long as the income exists to cover a ded. server (plus profit), I think a ded. server is pretty much a necessity... My 2nd and 3rd came pretty quickly thereafter. I might be able to drop the 3rd and deal with just 2 (due to Jagger hitting so hard), but then I'd have to spend time moving things over, which means money lost, so I haven't gotten around to it...


    --
    Derek
     
    dkalweit, Jan 10, 2006 IP
  9. T0PS3O

    T0PS3O Feel Good PLC

    Messages:
    13,219
    Likes Received:
    777
    Best Answers:
    0
    Trophy Points:
    0
    #9
    Already have one, with a dedicated firewall and external back-up. But you know hosts: they like charging excessively when you go over your allowance. Pulling the access files down cut-and-paste style will keep them from clogging up the necessary back-up.
     
    T0PS3O, Jan 10, 2006 IP
  10. dkalweit

    dkalweit Well-Known Member

    Messages:
    520
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    150
    #10
    You must push considerable bandwidth or have a cheap host to be able to go over your allowance so easily. Most ded. servers these days, even the cheap ones, offer 1000GB/mo+...


    --
    Derek
     
    dkalweit, Jan 10, 2006 IP
  11. T0PS3O

    T0PS3O Feel Good PLC

    Messages:
    13,219
    Likes Received:
    777
    Best Answers:
    0
    Trophy Points:
    0
    #11
    100GB a day 'external' (website traffic etc.), but the backup bandwidth is on a separate quota.
     
    T0PS3O, Jan 10, 2006 IP
  12. dkalweit

    dkalweit Well-Known Member

    Messages:
    520
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    150
    #12
    Ah, just back up the external stuff then... ;-) Better to have an offsite backup anyway...


    --
    Derek
     
    dkalweit, Jan 10, 2006 IP