$_SERVER['HTTP_USER_AGENT'] - Break it down with PHP

Discussion in 'PHP' started by adzeds, Feb 3, 2010.

  1. #1
    Does anyone know where I can find a script that will break down all the elements of the $_SERVER['HTTP_USER_AGENT'] so that I can save it in my database?

    Needs to use PHP. Hope someone can help!
     
    adzeds, Feb 3, 2010 IP
  2. SmallPotatoes

    #2
    SmallPotatoes, Feb 3, 2010 IP
  3. adzeds

    #3
    I am looking for a script that will pull out the browser, operating system and all the other parts so that they can be stored in a database separately.

    Hope someone can help!
     
    adzeds, Feb 4, 2010 IP
  4. SmallPotatoes

    #4
    SmallPotatoes, Feb 4, 2010 IP
  5. elias_sorensen

    #5
    get_browser will do that for you.

    It takes the user agent string and returns an object (or an array, if you pass true as the second argument) with those specs.
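A minimal sketch of calling it, assuming the server has the browscap directive set. The helper name browser_details is made up for illustration, and the keys you get back (typically things like browser, version and platform) depend on the browscap.ini file in use.

```php
<?php
// Hypothetical helper: return browser details as an array,
// or null when get_browser() cannot be used on this server.
function browser_details($userAgent)
{
    // get_browser() only works when the browscap ini directive is set.
    if (!ini_get('browscap')) {
        return null;
    }

    // Passing true as the second argument asks for an array instead of an object.
    $info = get_browser($userAgent, true);

    return is_array($info) ? $info : null;
}

$details = browser_details(
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.4) Gecko/20091016 Firefox/3.5.4'
);
```

If $details is not null, the individual values can be written to separate database columns.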
     
    elias_sorensen, Feb 4, 2010 IP
  6. beacon

    #6
    PHP can only see the HTTP headers.
    From them you can only find the browser, the language and maybe the operating system.
    Header example:
    GET /safebrow HTTP/1.1
    Host: www.google.com
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.4) Gecko/20091016 Firefox/3.5.4
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-us,ru-ru;q=0.8,ru;q=0.5,en;q=0.3
    Accept-Charset: utf-8,*;q=0.1
    Keep-Alive: 300
    Connection: keep-alive
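
As a rough illustration of reading one of these headers, here is a sketch that splits an Accept-Language value into language tags and their q weights. The function name parse_accept_language is made up for this example. In a real request the value would come from $_SERVER['HTTP_ACCEPT_LANGUAGE'], which may not be set.

```php
<?php
// Hypothetical helper: turn an Accept-Language header value into
// an array of language tag => q weight, highest weight first.
function parse_accept_language($header)
{
    $langs = array();

    foreach (explode(',', $header) as $item) {
        $pieces = explode(';', trim($item));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '') {
            continue;
        }

        // A missing q parameter means a weight of 1.
        $q = 1.0;
        if (isset($pieces[1]) && preg_match('/^\s*q=([0-9.]+)\s*$/i', $pieces[1], $m)) {
            $q = (float) $m[1];
        }

        $langs[$tag] = $q;
    }

    arsort($langs); // sort by weight, descending, keeping the tags as keys

    return $langs;
}

$langs = parse_accept_language('en-us,ru-ru;q=0.8,ru;q=0.5,en;q=0.3');
```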
    
     
    beacon, Feb 4, 2010 IP
  7. adzeds

    #7
    Some web stats programs can collect information such as Internet connection speed. How do they do this?
     
    adzeds, Feb 4, 2010 IP
  8. SmallPotatoes

    #8
    Mostly they guess.

    There are databases out there that purport to contain mappings between IP addresses and various connection attributes. The more accurate these databases are, the more they cost. The free ones are pretty awful.

    It's also possible to gather additional information about the client with Flash or JavaScript. These are intrusive processes and can add some delay to page display.
     
    SmallPotatoes, Feb 4, 2010 IP
  9. krsix

    #9
    get_browser() will throw it all into an array for you. Just use that.

    As for connection speed, this is purely best-guess, but use gethostbyaddr() then check it for words like:

    ipt.aol
    dial/dun/ras
    cable/cpe/cbl/hsi
    dsl/adsl/highspeed/hsi
    isdn/pri

    etc.

    You can autosort just with the 2nd level domain: rr.com is 99% cable, fios.verizon and anythingelse.verizon is fibre and then ADSL, comcast is 99% cable, comcastbusiness is 99% cable, etc.
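
A best-guess sketch of the keyword check described above. The function name guess_connection_type and the exact grouping of keywords are assumptions for illustration. Since hsi is listed under both cable and DSL, this version simply reports it as cable.

```php
<?php
// Hypothetical helper: guess a connection type from a reverse-DNS hostname.
// This is only a heuristic; many hostnames carry no hint at all.
function guess_connection_type($hostname)
{
    $host = strtolower($hostname);

    if (strpos($host, 'ipt.aol') !== false) {
        return 'dialup';
    }

    $hints = array(
        'dialup' => array('dial', 'dun', 'ras'),
        'cable'  => array('cable', 'cpe', 'cbl', 'hsi'),
        'dsl'    => array('dsl', 'adsl', 'highspeed'),
        'isdn'   => array('isdn', 'pri'),
    );

    // Split on anything that is not a letter, so "adsl-75-10.example.net"
    // becomes adsl, example, net. Whole-token matching avoids false hits
    // such as "ras" inside an unrelated word.
    $tokens = preg_split('/[^a-z]+/', $host, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($hints as $type => $words) {
        if (array_intersect($words, $tokens)) {
            return $type;
        }
    }

    return 'unknown';
}
```

In a page you would feed it something like gethostbyaddr($_SERVER['REMOTE_ADDR']). Keep in mind that gethostbyaddr() hands back the unmodified IP address when the lookup fails, which ends up as 'unknown' here.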
     
    krsix, Feb 4, 2010 IP
  10. danx10

    #10
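    get_browser() is not available on all servers (namely shared) because it only works when the browscap directive in php.ini points to a browscap.ini file, and that file is not bundled with PHP.

    $_SERVER['HTTP_USER_AGENT'] is a string (one element of the $_SERVER array), therefore to break it up you could possibly use a regular expression.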
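
A rough sketch of the regular expression idea, nowhere near a full parser. The function name split_user_agent is made up for this example. It only pulls out the first bracketed part and every Name/Version token, so working out which token is "the browser" is still up to you.

```php
<?php
// Hypothetical helper: split a User-Agent string into its Name/Version
// tokens and the pieces of the first "(...)" comment.
function split_user_agent($ua)
{
    $parts = array('products' => array(), 'comment' => array());

    // First bracketed group, e.g. "(Windows; U; Windows NT 5.1; en-US; rv:1.9.1.4)"
    if (preg_match('/\(([^)]*)\)/', $ua, $m)) {
        $parts['comment'] = array_map('trim', explode(';', $m[1]));
    }

    // Every "Name/Version" token, e.g. "Firefox/3.5.4"
    if (preg_match_all('#([A-Za-z][\w.-]*)/([\w.]+)#', $ua, $m, PREG_SET_ORDER)) {
        foreach ($m as $match) {
            $parts['products'][$match[1]] = $match[2];
        }
    }

    return $parts;
}

$parts = split_user_agent(
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.4) Gecko/20091016 Firefox/3.5.4'
);
```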
     
    danx10, Feb 6, 2010 IP
  11. krsix

    #11
    If your host won't load such an easy-to-install thing, they really shouldn't be in web hosting.
     
    krsix, Feb 6, 2010 IP
  12. danx10

    #12
    Maybe - but I believe shared hosting at dreamhost.com (which is an established web-hosting site) doesn't have this enabled.
     
    danx10, Feb 6, 2010 IP
  13. krsix

    #13
    Submit a ticket asking for it to be installed - I actually have a DH account and they have always installed any extension or plugins (PHP/Perl/etc) I needed.
     
    krsix, Feb 6, 2010 IP
  14. danx10

    #14
    I think we're going slightly off-topic :p But I'm not with DH, I was just trying to make a point - that you won't always be in an environment where you can simply submit a ticket or configure something yourself - so in the long run it can potentially cause problems.
     
    danx10, Feb 6, 2010 IP
  15. krsix

    #15
    Yeah, I guess. Back on track - why manually regex out the entire thing with a hacked-together script (you won't get as many browsers or as many stats as with get_browser()) when there's an (almost) native function for it?
     
    krsix, Feb 6, 2010 IP
  16. danx10

    #16
    Well the question was "$_SERVER['HTTP_USER_AGENT'] - Break it down with PHP"

    Yes, there's get_browser(), but I was just offering a possible alternative method.
     
    danx10, Feb 6, 2010 IP
  17. SmallPotatoes

    #17
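    You can get your own browscap.ini file and store it in your own directory, but as far as I know the browscap directive is PHP_INI_SYSTEM, so ini_set can't point PHP to it at runtime. It has to be set in php.ini or the server configuration before get_browser() will work.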
     
    SmallPotatoes, Feb 6, 2010 IP
  18. joebert

    #18
    The first thing you want to have when trying to duplicate the functionality of an existing PHP function for a system that doesn't have that function available, is a copy of the PHP source code.

    Here's the actual definition of get_browser from ./ext/standard/browscap.c:284

    /* {{{ proto mixed get_browser([string browser_name [, bool return_array]])
       Get information about the capabilities of a browser. If browser_name is omitted
       or null, HTTP_USER_AGENT is used. Returns an object by default; if return_array
       is true, returns an array. */
    
    PHP_FUNCTION(get_browser)
    {
    	zval **agent_name = NULL, **agent, **retarr;
    	zval *found_browser_entry, *tmp_copy;
    	char *lookup_browser_name;
    	zend_bool return_array = 0;
    	char *browscap = INI_STR("browscap");
    
    	if (!browscap || !browscap[0]) {
    		php_error_docref(NULL TSRMLS_CC, E_WARNING, "browscap ini directive not set");
    		RETURN_FALSE;
    	}
    
    	if (ZEND_NUM_ARGS() > 2 || zend_get_parameters_ex(ZEND_NUM_ARGS(), &agent_name, &retarr) == FAILURE) {
    		ZEND_WRONG_PARAM_COUNT();
    	}
    
    	if (agent_name == NULL || Z_TYPE_PP(agent_name) == IS_NULL) {
    		zend_is_auto_global("_SERVER", sizeof("_SERVER")-1 TSRMLS_CC);
    		if (!PG(http_globals)[TRACK_VARS_SERVER]
    			|| zend_hash_find(PG(http_globals)[TRACK_VARS_SERVER]->value.ht, "HTTP_USER_AGENT", sizeof("HTTP_USER_AGENT"), (void **) &agent_name)==FAILURE) {
    			php_error_docref(NULL TSRMLS_CC, E_WARNING, "HTTP_USER_AGENT variable is not set, cannot determine user agent name");
    			RETURN_FALSE;
    		}
    	}
    
    	convert_to_string_ex(agent_name);
    	lookup_browser_name = estrndup(Z_STRVAL_PP(agent_name), Z_STRLEN_PP(agent_name));
    	php_strtolower(lookup_browser_name, strlen(lookup_browser_name));
    
    	if (ZEND_NUM_ARGS() == 2) {
    		convert_to_boolean_ex(retarr);
    		return_array = Z_BVAL_PP(retarr);
    	}
    
    	if (zend_hash_find(&browser_hash, lookup_browser_name, strlen(lookup_browser_name)+1, (void **) &agent)==FAILURE) {
    		found_browser_entry = NULL;
    		zend_hash_apply_with_arguments(&browser_hash, (apply_func_args_t) browser_reg_compare, 2, lookup_browser_name, &found_browser_entry);
    
    		if (found_browser_entry) {
    			agent = &found_browser_entry;
    		} else if (zend_hash_find(&browser_hash, DEFAULT_SECTION_NAME, sizeof(DEFAULT_SECTION_NAME), (void **) &agent)==FAILURE) {
    			efree(lookup_browser_name);
    			RETURN_FALSE;
    		}
    	}
    
    	if (return_array) {
    		array_init(return_value);
    		zend_hash_copy(Z_ARRVAL_P(return_value), Z_ARRVAL_PP(agent), (copy_ctor_func_t) zval_add_ref, (void *) &tmp_copy, sizeof(zval *));
    	}
    	else {
    		object_init(return_value);
    		zend_hash_copy(Z_OBJPROP_P(return_value), Z_ARRVAL_PP(agent), (copy_ctor_func_t) zval_add_ref, (void *) &tmp_copy, sizeof(zval *));
    	}
    
    	while (zend_hash_find(Z_ARRVAL_PP(agent), "parent", sizeof("parent"), (void **) &agent_name)==SUCCESS) {
    		if (zend_hash_find(&browser_hash, Z_STRVAL_PP(agent_name), Z_STRLEN_PP(agent_name)+1, (void **)&agent)==FAILURE) {
    			break;
    		}
    
    		if (return_array) {
    			zend_hash_merge(Z_ARRVAL_P(return_value), Z_ARRVAL_PP(agent), (copy_ctor_func_t) zval_add_ref, (void *) &tmp_copy, sizeof(zval *), 0);
    		}
    		else {
    			zend_hash_merge(Z_OBJPROP_P(return_value), Z_ARRVAL_PP(agent), (copy_ctor_func_t) zval_add_ref, (void *) &tmp_copy, sizeof(zval *), 0);
    		}
    	}
    
    	efree(lookup_browser_name);
    }
    /* }}} */
    If you look further through that file you'll see what it's basically doing is loading the browscap.ini file using the equivalent of parse_ini_file and looking the user-agent up in it.

    That's not exactly how it works, though. If you look through some of the entries in the example browscap.ini files php.net links to, you'll see that they're not exact matches to user-agent strings. Some of the browscap entries employ wildcards to cover multiple minor versions of a browser where the corresponding values don't change, which keeps the hash table small.

    If you look further through the file you'll see that the get_browser function loops through the loaded ini file and creates regular expression patterns out of the browser entries with this function.

    /* {{{ convert_browscap_pattern
     */
    static void convert_browscap_pattern(zval *pattern)
    {
    	register int i, j;
    	char *t;
    
    	php_strtolower(Z_STRVAL_P(pattern), Z_STRLEN_P(pattern));
    
    	t = (char *) safe_pemalloc(Z_STRLEN_P(pattern), 2, 3, 1);
    
    	t[0] = '^';
    
    	for (i=0, j=1; i<Z_STRLEN_P(pattern); i++, j++) {
    		switch (Z_STRVAL_P(pattern)[i]) {
    			case '?':
    				t[j] = '.';
    				break;
    			case '*':
    				t[j++] = '.';
    				t[j] = '*';
    				break;
    			case '.':
    				t[j++] = '\\';
    				t[j] = '.';
    				break;
    			default:
    				t[j] = Z_STRVAL_P(pattern)[i];
    				break;
    		}
    	}
    
    	t[j++] = '$';
    
    	t[j]=0;
    	Z_STRVAL_P(pattern) = t;
    	Z_STRLEN_P(pattern) = j;
    }
    /* }}} */
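
For anyone who doesn't read C, here is a sketch of the same conversion in PHP. It mirrors the quoted function character by character, so like the original it only deals with ?, * and the dot. Other regex metacharacters (brackets, for instance) are passed through as they are, and you would have to add delimiters before handing the result to preg_match.

```php
<?php
// Sketch of convert_browscap_pattern() in PHP, following the quoted C code.
function convert_browscap_pattern($pattern)
{
    $pattern = strtolower($pattern);
    $regex = '^';

    $length = strlen($pattern);
    for ($i = 0; $i < $length; $i++) {
        switch ($pattern[$i]) {
            case '?':            // any single character
                $regex .= '.';
                break;
            case '*':            // any run of characters
                $regex .= '.*';
                break;
            case '.':            // a literal dot has to be escaped
                $regex .= '\\.';
                break;
            default:
                $regex .= $pattern[$i];
                break;
        }
    }

    return $regex . '$';
}

$regex = convert_browscap_pattern('Firefox/3.?*');
```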
    So basically what you would need to do is load a browscap.ini file into your own array with parse_ini_file (with process_sections turned on, so that each browser pattern becomes a key), then loop through every one of the keys in that array, converting each to a regular expression, and then apply that regular expression to your HTTP_USER_AGENT to see if it matches.

    If it matches, you return the entry in your browscap array that corresponds to the key you generated a regular expression pattern from.
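
A minimal sketch of those steps, with a few assumptions of my own. The function names are made up. The keys are escaped with preg_quote rather than with a direct port of the C routine. When several keys match, the longest one is picked, which is not necessarily what PHP itself does. The "Parent" handling copies the merge loop at the end of get_browser.

```php
<?php
// Hypothetical helper: turn one browscap key into a PCRE pattern.
// preg_quote() escapes every metacharacter, then the two wildcards are restored.
function browscap_key_to_regex($key)
{
    $quoted = preg_quote(strtolower($key), '/');

    return '/^' . strtr($quoted, array('\\*' => '.*', '\\?' => '.')) . '$/';
}

// Hypothetical helper: look up $userAgent in $browscap (section name => settings).
function lookup_browser(array $browscap, $userAgent)
{
    $agent = strtolower($userAgent);
    $found = null;

    foreach ($browscap as $key => $settings) {
        if (!preg_match(browscap_key_to_regex($key), $agent)) {
            continue;
        }
        // Assumption: the longest matching key is the most specific one.
        if ($found === null || strlen($key) > strlen($found)) {
            $found = $key;
        }
    }

    if ($found === null) {
        return false;
    }

    // Walk up the "Parent" chain; values already set by the child are kept.
    $result = $browscap[$found];
    $current = $result;
    $seen = array($found => true);
    while (isset($current['Parent'], $browscap[$current['Parent']])
        && !isset($seen[$current['Parent']])) {
        $seen[$current['Parent']] = true;
        $current = $browscap[$current['Parent']];
        $result += $current;
    }

    return $result;
}
```

The $browscap array would come from something like parse_ini_file('/path/to/browscap.ini', true, INI_SCANNER_RAW), where the path is a placeholder. The raw scanner mode is probably what you want there, because the section names are full of characters the normal ini parser treats specially.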

    Make sense? :)
     
    joebert, Feb 6, 2010 IP