Small HTTP proxy (+ SOCKS)

Discussion in 'Tutorials and Guides' started by wlanboy, May 17, 2013.

  1. wlanboy

    wlanboy Content Contributer

    2,126
    1,169
    May 16, 2013
    Sometimes it is useful to have a small HTTP proxy running for a second browser window.

    My favorite HTTP proxy is "polipo". It has a small footprint of 2 MB.

    Installation is quite easy:


    apt-get install polipo

    The config files are stored in /etc/polipo

    There are two config files that should be altered:

    1. /etc/polipo/config
    2. /etc/polipo/forbidden
    The config file itself includes the following options that should be set:


    proxyAddress = 127.0.0.1 #Listening IP of the proxy server
    proxyPort = 8888 #Port of proxy server

    allowedClients = 127.0.0.1,10.10.10.0/24 #Comma separated list of allowed clients

    proxyName = "proxy" #HTTP header name of proxy

    dnsQueryIPv6 = no #query for IPv6 addresses

    tunnelAllowedPorts = 443,5656 #Comma separated list of SSL ports

    diskCacheRoot = "~/.polipo-cache/" #Folder for cache

    localDocumentRoot = "" #disable local webserver

    You can use cron + wget with one of the filter lists from http://adblockplus.org/en/subscriptions to fill the file "/etc/polipo/forbidden", which adds something like an ad blocker to your HTTP proxy.

    Ensure that polipo is not listening on a publicly reachable IP address. You really do not want to run a public proxy.
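
    A quick way to double-check this is a small Ruby sketch like the one below. It is only an illustration and not part of polipo: the helper names proxy_address and publicly_reachable? are made up here, it only looks at IPv4 addresses, and it assumes the "key = value #comment" layout shown above.

```ruby
require 'ipaddr'

# IPv4 ranges that are not reachable from the public internet
# (loopback plus the RFC 1918 private ranges).
NON_PUBLIC = %w[127.0.0.0/8 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16]
             .map { |range| IPAddr.new(range) }

# Hypothetical helper: extract the proxyAddress value from polipo config text.
# Returns nil when the option is not set.
def proxy_address(config_text)
  config_text.each_line do |line|
    line = line.sub(/#.*/, '').strip                 # drop trailing comments
    key, value = line.split('=', 2).map(&:strip)
    return value.delete('"') if key == 'proxyAddress' && value
  end
  nil
end

# Hypothetical helper: true when the address lies outside all NON_PUBLIC ranges.
# Note that 0.0.0.0 (listen on every interface) counts as public here.
def publicly_reachable?(address)
  ip = IPAddr.new(address)
  NON_PUBLIC.none? { |range| range.include?(ip) }
end

# Usage (path taken from the post above):
#   addr = proxy_address(File.read('/etc/polipo/config'))
#   warn "polipo listens on #{addr}!" if addr && publicly_reachable?(addr)
```

    If the option is missing from the file, the sketch returns nil and says nothing about what polipo does by default, so check the listening socket as well in that case.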

    Now you can enter "127.0.0.1:8888" as the proxy address in your browser settings.

    If you need more, e.g. a SOCKS proxy, you can use the SOCKS server that is part of ssh:


    ssh -D 8080 yourvps.com

    Now you can enter "127.0.0.1:8080" as the SOCKS host in your browser's proxy settings.

    PS:
    If you are using Firefox, make sure you enable "network.proxy.socks_remote_dns" on the "about:config" page. If you use a SOCKS proxy, Firefox should resolve DNS names through your VPS too; otherwise the DNS lookups bypass the proxy. I do not know why the default value is "disabled".
     
  2. drmike

    drmike 100% Tier-1 Gogent

    8,573
    2,717
    May 13, 2013
    Another good tutorial!!!

    Haven't played with polipo before so this will get me started.
     
  3. wlanboy

    An early version of a script to convert Adblock rules to polipo's format:

    1. Download e.g. easylist

      wget https://easylist-downloads.adblockplus.org/easylist.txt -O ~/adblock.txt

    2. Save Ruby script
      Code:
      nano ~/adblockconverter.rb
      
      With content:
       


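      #!/usr/bin/ruby
      # Convert an Adblock Plus filter list into polipo's "forbidden" format.
      if ARGV.length == 0
        # abort prints the message and exits with a failure status
        # (exit only accepts a status, not a message).
        abort("Usage: ruby adblockconverter.rb <adblock file>")
      end

      adblockfilename = ARGV[0]

      if not File.exist?(adblockfilename)
        abort("The adblock file (#{adblockfilename}) does not exist!")
      end

      # Everything from the first "$" on is an Adblock option list and is dropped.
      dollar_re = Regexp.new(/(.*?)\$.*/)

      File.open('./forbidden', 'w') do |polipofile|
        File.readlines(adblockfilename).each { | line |
          # Lines read from the file still carry their newline, so strip first.
          line = line.strip
          next if line.empty?
          # Skip headers, comments, exceptions and element hiding rules.
          if (["[", "!", "~", "#", "@"].include?(line[0]) or
              line[0, 8] == "/adverti" or
              line.include?("##"))
            next
          end
          line = line.gsub(dollar_re, "\\1")
          line = line.gsub("|", "")    # also removes the "||" anchors
          line = line.gsub(".", "\\.")
          line = line.gsub("*", ".*")
          line = line.gsub("?", "\\?")
          line = line.gsub("^", "[\\/:\\.=&\\?\\\\+\\-\\ ]+")
          polipofile.puts(line) unless line.empty?
        }
      end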


    3. Run converter
      Code:
      ruby adblockconverter.rb ~/adblock.txt
      
    4. Move the created "forbidden" file to /etc/polipo/forbidden
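
    To see what the conversion does to a single rule, here is a minimal sketch of the same steps as a function. The name to_polipo_pattern and the example rules (ads.example.com, banner.php) are made up for illustration; it is meant to mirror the script, not to be a complete Adblock parser.

```ruby
# Hypothetical helper: returns the converted pattern for one Adblock rule,
# or nil when the rule is skipped.
def to_polipo_pattern(rule)
  rule = rule.strip
  return nil if rule.empty?
  # Skip headers, comments, exceptions and element hiding rules.
  return nil if ["[", "!", "~", "#", "@"].include?(rule[0]) ||
                rule[0, 8] == "/adverti" ||
                rule.include?("##")

  rule = rule.sub(/\$.*/, "")     # drop the option list after "$"
  rule = rule.delete("|")         # drop "|" and "||" anchors
  rule = rule.gsub(".", "\\.")    # a dot becomes a literal dot
  rule = rule.gsub("*", ".*")     # Adblock wildcard becomes a regex wildcard
  rule = rule.gsub("?", "\\?")    # a question mark becomes a literal one
  rule.gsub("^", "[\\/:\\.=&\\?\\\\+\\-\\ ]+")  # Adblock separator placeholder
end

puts to_polipo_pattern("||ads.example.com/*.gif")
puts to_polipo_pattern("/banner.php?id=$script")
p    to_polipo_pattern("! this is a comment")
```

    The first two calls print the escaped patterns, the last one prints nil because comment lines are skipped.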

    If you do not know how to install Ruby check this tutorial.

    PS: A cron job that does the whole download/convert/overwrite polipo config/restart polipo cycle would look like:
     

    Code:
    # A crontab entry must be on a single line. The "cd" is needed because the converter writes ./forbidden.
    30 23 * * * cd /adblock && /usr/bin/wget http://easylist.adblockplus.org/easylist.txt -O /adblock/easy.txt && /home/wlanboy/.rvm/rubies/ruby-2.0.0-p195/bin/ruby /adblock/adblockconverter.rb /adblock/easy.txt && /bin/cp -f /adblock/forbidden /etc/polipo/forbidden && service polipo restart
    
     
    Last edited by a moderator: May 25, 2013
  4. willie

    willie Active Member

    760
    207
    May 24, 2013
    OpenSSH has a SOCKS proxy built in.  Just say "ssh -D8888 yourserver" and ssh opens a tunnel to the other server (which could even be localhost afaik) and you can configure your browser to use it.  I use VPSes as SOCKS proxies this way all the time.
     
    Last edited by a moderator: May 25, 2013
  5. wlanboy

    You are right. Look at the bottom of my first post.
     
  6. Fritz

    Fritz New Member

    13
    2
    May 16, 2013
    Does Polipo support PAM authentication?
     
  7. wlanboy

    I don't think that it is supported.

    There are currently two ways of authentication:

    1. allowedClients

      allowedClients = 127.0.0.1, 10.10.1.0/24

    2. authCredentials

      Code:
      authCredentials = username:password
      
     
  8. drmike

    Well, finally took the polipo plunge.

    Polipo works, but is very flaky. Incomplete page loads, images that are broken/bad and, most troubling, random crashes of the daemon.

    That's with the current version shipped in Debian 7.
     
  9. Nyr

    Nyr New Member

    113
    47
    May 16, 2013
    An alternative to Polipo is bouncer, a good old HTTP and SOCKS proxy:

    http://www.nagilum.net/bouncer/

    Unpack and run something like:


    ./bouncer --port 12345 --socks5
    You can use auth too:

    Code:
    ./bouncer --port 12345 --socks5 --s_user test --s_password test
     
    Last edited by a moderator: Aug 4, 2013
  10. drmike

    Bouncer seems to have issues with modern sites per se.   Getting lots of incomplete page loads.

    Polipo was interesting mainly for caching. Probably headed back to Squid since it has a bunch of features, privacy features, etc. It's a bear to config optimally, but works pretty well.

    I routinely use ssh tunneling, but it is barebones and not the best solution for a shared environment where multiple people may be doing the same things/overlapping.

    Real need/concern as centralizing blocking, ad ignoring, blocking bad networks, etc. become necessary in my world.  Sure lots of other folks are reading the news lately and concerned about their traffic, web use, etc.
     
  11. drmike

    Still running polipo.... It still is crashing. No clue as to why. Utterly random about crashing also.

    Don't get it. Have it running in Virtualbox instance with way more resources than it needs. Tweaked ulimits to eliminate that common batch of issues. Still a problem.
     
  12. perennate

    perennate New Member Verified Provider

    387
    106
    May 15, 2013
    Any way to make the SSH SOCKS proxy better? At least when using Firefox, whenever one connection hangs everything else after it hangs too.

    I tried OpenVPN but then figured I don't want to proxy my SSH connections; they're laggy enough as is. Might try mosh for that though :)
     
    Last edited by a moderator: Aug 8, 2013
  13. drmike

    Well still fussing with Polipo.   Getting somewhere with it, but still failing ugly at times.

    I've mainly stopped Polipo from hard crashing by enabling the high RAM options in the config file.

    Have problems with file descriptors though and Polipo hitting that ceiling and basically crashing.  Working on resolving that issue and submitting tickets to the program authors.   They need to deal with that situation far more gracefully.
     
  14. acd

    acd New Member

    176
    71
    May 16, 2013
    Or for people who C.B.A. to help the maintainers:


    #!/bin/bash
    trap "" 1
    while sleep 10;
    do
      if ! pgrep -F /var/run/polipo.pid > /dev/null
      then
        /etc/init.d/polipo restart > /dev/null 2>/dev/null
      fi
    done
    Then background it, disown it from bash, and let it do its own thing.
     
    I didn't comment on this specifically because of the reasons buffalooed brought up. Polipo is flakey as hell, but it's MUCH nicer at running parallel requests than ssh's -D. Running anything with websockets over polipo also dies a fiery death (except wss, that usually works OK). I've been messing around with danted, but getting that to work has been less than thrilling and it doesn't offer the filtering features that wlanboy has described.

    fwiw, I guess authentication is OK, but it has the same caveats as http plain auth; passwords are sent in the clear. You're much better off requiring vpn access and then doing IP based limits. Then again if you have a vpn configured, there isn't really much reason for a proxy... right?
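
    To show how little protection that is, here is a minimal Ruby sketch of HTTP Basic proxy authentication, assuming that is what authCredentials ends up using on the wire. The helper names proxy_authorization and credentials_from are made up, and "username"/"password" are just the placeholders from the config example above.

```ruby
# Hypothetical helper: build the header value a client sends for Basic auth.
# pack('m0') is Base64 without line breaks; it is an encoding, not encryption.
def proxy_authorization(username, password)
  'Basic ' + ["#{username}:#{password}"].pack('m0')
end

# Hypothetical helper: recover the credentials from a captured header value.
# No key or secret is needed, decoding Base64 is enough.
def credentials_from(header_value)
  header_value.sub(/\ABasic /, '').unpack('m').first.split(':', 2)
end

header = proxy_authorization('username', 'password')
puts "Proxy-Authorization: #{header}"
p credentials_from(header)
```

    Anyone who can read the unencrypted traffic between browser and proxy can run the second function on what they see.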
     
    Last edited by a moderator: Aug 16, 2013
  15. rm_

    rm_ New Member

    47
    21
    May 15, 2013
    Can confirm this, polipo was very unreliable for me.

    I am just using squid currently. You configure it once, then keep the same config file and use it everywhere (e.g. on a new VPS that you also want to run a proxy). This is not difficult.
     
    Last edited by a moderator: Aug 16, 2013
  16. drmike

    I am looking back again at Squid.  Polipo breaks with a single user, so no way I would try it as-is with multiple users.

    @rm_, Squid, care to share a working config file you have had recent success with?  Any emphasis on privacy with your config (one of the reasons Squid is downright appealing to me)?
     
  17. 365Networks

    365Networks New Member

    121
    38
    May 15, 2013
    Polipo ran fine for me; however, I prefer Squid myself. Here is a working conf you can have:

    via off
    forwarded_for off
    #follow_x_forwarded_for deny all
    #=============START CONFIGURATION========
    #==============================================
    # TAG: http_port
    #==============================================
    http_port 3128 transparent
    icp_port 0
    server_http11 on

    #==============================================
    # TAG: hierarchy_stoplist
    #==============================================
    hierarchy_stoplist cgi-bin \? localhost
    acl QUERY urlpath_regex cgi-bin \? localhost
    no_cache deny QUERY

    #==============================================
    # OPTIONS WHICH AFFECT THE CACHE SIZE
    #==============================================
    cache_mem 8 MB
    maximum_object_size 50 MB
    maximum_object_size_in_memory 128 KB

    cache_swap_low 98%
    cache_swap_high 99%
    cache_replacement_policy heap LFUDA
    memory_replacement_policy heap GDSF

    ipcache_size 16384
    fqdncache_size 16384

    ipcache_low 98
    ipcache_high 99

    #==============================================
    # LOGFILE PATHNAMES AND CACHE DIRECTORIES
    #==============================================
    cache_access_log /var/log/squid/access.log
    cache_log none
    cache_store_log none

    mime_table /usr/share/squid/mime.conf

    # PID squid.
    pid_filename /var/run/squid.pid
    coredump_dir /home/ncode/cache/

    log_fqdn off
    log_icp_queries off
    buffered_logs off
    emulate_httpd_log off

    #==============================================
    # FTP section
    #==============================================
    ftp_list_width 32
    ftp_passive on
    ftp_sanitycheck on

    #==============================================
    # DNS resolution section
    #==============================================
    dns_nameservers 8.8.8.8 8.8.4.4

    #==============================================
    # Filesystem section
    #==============================================
    #diskd_program /usr/bin/diskd

    #==============================================
    # Refresh Rate
    #==============================================
    # refresh_pattern REGEX MIN_MINUTES VALIDITY(%) MAX_MINUTES
    refresh_pattern -i \.(class|css|js|gif|jpg|ps)$ 1440 50% 43200
    refresh_pattern -i \.(jpe|jpeg|png|bmp|tif)$ 1440 50% 43200
    refresh_pattern -i \.(tiff|mov|avi|qt|mpeg|flv|ra|rm|wmv|divx)$ 1440 50% 43200
    refresh_pattern -i \.(mpg|mpe|wav|au|mid|mp3|mp4|ac4|swf)$ 1440 50% 43200
    refresh_pattern -i \.(zip|gz|arj|lha|lzh|7z)$ 1440 50% 43200
    refresh_pattern -i \.(rar|tgz|tar|exe|bin|rpm|iso)$ 1440 50% 43200
    refresh_pattern -i \.(hqx|pdf|rtf|doc|swf|xls|ppt|pdf|docx|xlsx)$ 1440 50% 43200
    refresh_pattern -i \.(inc|cab|ad|txt|dll|dat)$ 1440 50% 43200

    refresh_pattern ^ftp: 1440 95% 12960 reload-into-ims
    refresh_pattern ^gopher: 1440 0% 1440
    refresh_pattern . 0 20% 4320

    quick_abort_min 0 KB
    quick_abort_max 0 KB
    quick_abort_pct 100%

    #==============================================
    # ACL section
    #==============================================
    acl all src 0.0.0.0/0.0.0.0
    acl manager proto cache_object
    acl localnet src PUTYOURIPHERE
    acl localhost src 127.0.0.1/255.255.255.255
    acl SSL_ports port 443 563 445 # https, snews
    acl Safe_ports port 80 81 # http
    acl Safe_ports port 21 # ftp
    acl Safe_ports port 443 563 # https, snews
    acl Safe_ports port 70 # gopher
    acl Safe_ports port 210 # wais
    acl Safe_ports port 1025-65535 # unregistered ports
    acl purge method PURGE
    acl CONNECT method CONNECT
    always_direct allow localnet localhost
    always_direct deny all
    http_access allow manager all
    http_access deny !Safe_ports
    http_access allow purge localhost
    http_access deny purge
    http_access allow localhost
    http_access allow localnet
    http_access deny all
    http_reply_access allow all
    icp_access allow all
    miss_access allow localnet
    miss_access deny all
    visible_hostname proxy
    header_access Accept-Encoding deny all

    #==============================================
    # MISCELLANEOUS
    #==============================================
    logfile_rotate 7
    negative_ttl 2 minute
    client_persistent_connections on
    server_persistent_connections on
    pipeline_prefetch on
    vary_ignore_expire on
    reload_into_ims on
    nonhierarchical_direct off
    prefer_direct off
    memory_pools off
    ie_refresh on
    cache_effective_user proxy
    cache_effective_group proxy

    #=============================================
    #Tag ZPH
    #=============================================
    zph_mode tos
    zph_local 0x30
    zph_parent 0
    zph_option 136
    #==========END OF CONFIGURATION=========
    Make sure you put your own IP address into the line 'acl localnet src PUTYOURIPHERE'.
    I've had few issues with this conf, except for the odd time it won't connect to a server via SSH.
     
  18. drmike

    I am certainly headed back to Squid too.

    Polipo has been better behaved, but still has mass issues.

    Latest fun thing, the cache filled up the drive.

    Polipo doesn't maintain a cache inventory/index.   So if I keep running it, I have to schedule this with cron:


    polipo -x

    That purges the cache, but the purge doesn't seem to be complete.  Unsure what criteria it uses to determine purgeables.
     
  19. peterw

    peterw New Member

    800
    189
    Jun 14, 2013
    I have added the following lines to handle local stuff:

    Code:
    acl DIRECTS dstdomain domain1.com domain2.com .lan
    never_direct allow all
    always_direct allow DIRECTS
    
     