Page 2 of 3
Results 11 to 20 of 21

Thread: Any reason not to go "transparent"?

  1. #11
    Very good friend of the forum Virchanza's Avatar
    Join Date
    Jan 2010
    Posts
    863

    Default

    The non-transparent proxy server works because the client application (usually a web browser) is configured to forward its requests for web sites (no matter what port they are running on) to the listening port of the proxy server. The proxy service then forwards those requests on.
    I'm with you.

    The proxy, whether it is transparent or non-transparent, only ever has to listen on one port (it's TCP 3128 for Squid by default; other proxies use other default ports like TCP 8080). When in transparent mode, a filtering process (e.g. Linux netfilter) that sees the traffic changes the destination port number of the traffic to the port that the transparent proxy is running on, and changes the MAC address of the frame to the MAC of the proxy server (this assumes that the proxy and filter are on different systems; it works slightly differently otherwise). The proxy service then receives IP packets with a destination address of the destination web site, on the TCP port that the proxy server is listening on. The proxy then knows how to forward the traffic on because it has the IP address of the destination site from the IP packet header. It has to assume that the port number is 80 because the original port number is lost during the translation.
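    As a concrete sketch of the redirection step described above, a netfilter rule along these lines (run on the proxy box itself; the interface name is illustrative) rewrites the destination port of intercepted web traffic to Squid's listener:

```shell
# Hypothetical transparent-interception rule: traffic arriving on eth0
# destined for TCP port 80 is redirected to the local Squid listener
# on port 3128. The original destination port is overwritten here.
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3128
```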
    I have to say I was very disappointed when I first heard that that's how Squid works. Instead of listening on just one port, Squid should be fully integrated into the whole "IP forwarding system", meaning that it should be watching out for incoming IP packets whose destination IP address is different from the actual machine's IP address. That way, it would be able to deal with any protocol on any port (e.g. telnet over port 778).

    When a non transparent proxy receives a request, the client sends the request directly to the IP address and port of the proxy server that it is configured to use, and the HTTP request header is modified to include the full URL of the destination site so that the proxy knows how to forward the traffic. The full URL will include the port number of the site if it is running on a non standard port. Note that regular "direct" HTTP requests don't include the full URL in the HTTP header, so the transparent proxy does not get the same option.
    Ok, now I see what you're getting at. If I understand correctly, you're saying the following:

    1) For a transparent proxy server, trying to load "www.abc.com:18/" will result in an HTTP GET request of simply "www.abc.com".
    2) For a non-transparent proxy server, trying to load "www.abc.com:18/" will result in an HTTP GET request of "www.abc.com:18/" (i.e. it includes the port number). The proxy server will then remove this port number from the URL when actually fetching the webpage from the remote server. Right?
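    That difference can be sketched in code. The toy parser below (hostnames invented, and not how any real proxy is implemented) shows how a non-transparent proxy can recover the non-standard port from the absolute URL in the request line:

```python
# Sketch: extracting host, port and path from a proxy-style request
# line. Falls back to port 80 when the URL omits it.
from urllib.parse import urlsplit

def parse_proxy_request_line(line):
    """Return (host, port, path) from an absolute-URL request line."""
    method, url, version = line.split()
    parts = urlsplit(url)
    return parts.hostname, parts.port or 80, parts.path or "/"

# The non-standard port survives in the absolute URL:
print(parse_proxy_request_line("GET http://www.abc.com:18/ HTTP/1.1"))
# -> ('www.abc.com', 18, '/')
```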

    Now I can see how a non-transparent proxy server might be preferable, thanks for that.

    You know what though... I think it's a major flaw that current transparent proxy servers can't work with the entire "IP forwarding system". I'll have a think about contacting the guys who make Squid and see what they think of integrating it into the "IP forwarding system".

    I think the way transparent proxying works is much uglier, but it's a matter of opinion I guess.
    If the transparent proxy server was to simply just listen out for traffic that's destined for a foreign IP, then it'd be the perfect setup.

    Right now, I'm actually on a network that has a non-transparent proxy server for accessing the internet. Just a few minutes ago, I wanted to sync my computer clock to an internet time server... but of course I was gonna have a problem because of the proxy server, I'd have to enter the proxy information somewhere. So anyway I wasn't bothered spending more than 11 seconds on it so my clock's still wrong. I'll sort it out when I'm on a "proper" network.
    Ask questions on the open forums, that way everybody benefits from the solution, and everybody can be corrected when they make mistakes. Don't send me private messages asking questions that should be asked on the open forums, I won't respond. I decline all "Friend Requests".

  2. #12
    Super Moderator lupin's Avatar
    Join Date
    Jan 2010
    Posts
    2,943

    Default

    Quote Originally Posted by Virchanza View Post
    I have to say I was very disappointed when I first heard that that's how Squid works. Instead of listening on just one port, Squid should be fully integrated into the whole "IP forwarding system", meaning that it should be watching out for incoming IP packets whose destination IP address is different from the actual machine's IP address. That way, it would be able to deal with any protocol on any port (eg. telnet over port 778).
    This would probably need to be integrated into the IP stack to work efficiently, which would mean kernel integration, and which would make it harder to keep the proxy portable across Operating Systems (even amongst Unix-like Operating Systems, like Linux, BSD, etc). It's starting to get more into the application firewall space rather than the proxy space. There are some NetFilter extensions for Linux that are going in that direction, but they are not as functional as a proxy just yet.

    Quote Originally Posted by Virchanza View Post
    Ok, now I see what you're getting at. If I understand correctly, you're saying the following:

    1) For a transparent proxy server, trying to load "www.abc.com:18/" will result in an HTTP GET request of simply "www.abc.com".
    2) For a non-transparent proxy server, trying to load "www.abc.com:18/" will result in an HTTP GET request of "www.abc.com:18/" (i.e. it includes the port number). The proxy server will then remove this port number from the URL when actually fetching the webpage from the remote server. Right?
    Close.

    I grabbed the GET requests for a visit to www.microsoft.com from the HTTP headers for two different web requests, one using a proxy and the other not using a proxy (which are the type of requests a transparent proxy gets):

    No proxy (or transparent proxy):
    GET / HTTP/1.1

    Proxy:
    GET http://www.microsoft.com/ HTTP/1.1

    See how only the requests with a proxy configured include the hostname and protocol in the URL? This gives the proxy enough information on how to connect to the required destination host, tells it what protocol to use (http or https), and also allows alternate ports to be specified (e.g. GET http://www.satan.com:666/bad/evil.html HTTP/1.1).

    The hostname is not specified in the request without a proxy, because the browser believes it is making a direct TCP connection to the destination web server, so there is no point in specifying the complete URL in the GET request. The transparent proxy has to determine the destination system by using the destination IP address in the packets it receives, and it has to assume the port because it can't get that from the TCP header, since it's been modified to allow the traffic to be redirected to the proxy service.
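    As an aside, on Linux the redirection target can usually recover the original destination from the kernel's connection-tracking state via the SO_ORIGINAL_DST socket option. Here is a minimal sketch of parsing the sockaddr_in buffer it returns (the sample bytes are fabricated, so no live socket is needed):

```python
# Sketch: recovering the pre-redirection destination on Linux.
# The kernel returns a 16-byte struct sockaddr_in; only the parsing
# is shown here.
import socket
import struct

SO_ORIGINAL_DST = 80  # constant from <linux/netfilter_ipv4.h>

def parse_original_dst(raw):
    """Unpack a 16-byte sockaddr_in buffer into (ip, port)."""
    # 2-byte family (skipped), 2-byte port (network order),
    # 4-byte IPv4 address, then 8 bytes of zero padding.
    port, addr = struct.unpack("!2xH4s8x", raw)
    return socket.inet_ntoa(addr), port

# On a real accepted connection one would call:
#   raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
sample = b"\x02\x00" + struct.pack("!H", 80) + socket.inet_aton("203.0.113.7") + b"\x00" * 8
print(parse_original_dst(sample))  # -> ('203.0.113.7', 80)
```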

    Quote Originally Posted by Virchanza View Post
    You know what though... I think it's a major flaw that current transparent proxy servers can't work with the entire "IP forwarding system". I'll have a think about contacting the guys who make Squid and see what they think of integrating it into the "IP forwarding system".
    I don't know if it necessarily qualifies as a flaw because I don't think it was ever a design goal for the Squid server to work in that way. I'm fairly sure it was designed to primarily work as a normal proxy first, based on a model where it worked as a listening TCP service, and only proxied traffic like SOCKS, http, https, gopher and FTP, and the transparent features for http were added later to provide some additional useful functionality.

    Making it work with the IP forwarding system is much more complicated, would be much harder to maintain portability for, has the potential to slow down all traffic, would require it to understand all possible higher level protocols, etc, etc. It would definitely be useful, but probably not that easy to actually implement.

    Check with the Squid devs if you wish however, they may be able to provide more info...

    Quote Originally Posted by Virchanza View Post
    Right now, I'm actually on a network that has a non-transparent proxy server for accessing the internet. Just a few minutes ago, I wanted to sync my computer clock to an internet time server... but of course I was gonna have a problem because of the proxy server, I'd have to enter the proxy information somewhere. So anyway I wasn't bothered spending more than 11 seconds on it so my clock's still wrong. I'll sort it out when I'm on a "proper" network.
    You would probably find that NTP/SNTP wouldn't work through a non-transparent proxy anyway. Most usually only support some combination of HTTP/HTTPS/FTP/SOCKS/GOPHER (and GOPHER is rare because it's so old) and usually nothing else. To get anything else through it you would need to encapsulate it in one of those protocols (for example HTTP tunneling, socksify-ing the traffic, etc.).
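    For the HTTP-tunneling option mentioned above, the client asks the proxy to open a raw TCP tunnel with a CONNECT request. A toy request builder (the host and port are placeholders, it only builds the bytes rather than opening a connection, and many proxies refuse CONNECT to arbitrary ports):

```python
# Sketch: the CONNECT request a client would send a proxy to tunnel
# an arbitrary TCP protocol through it.
def build_connect_request(dest_host, dest_port):
    """Build the CONNECT request bytes for a proxy tunnel."""
    return (
        f"CONNECT {dest_host}:{dest_port} HTTP/1.1\r\n"
        f"Host: {dest_host}:{dest_port}\r\n"
        f"\r\n"
    ).encode("ascii")

# e.g. asking the proxy for a tunnel to the old TCP TIME service (port 37)
print(build_connect_request("time.example.com", 37).decode())
```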
    Capitalisation is important. It's the difference between "Helping your brother Jack off a horse" and "Helping your brother jack off a horse".

    The Forum Rules, Forum FAQ and the BackTrack Wiki... learn them, love them, live them.

  3. #13
    Member
    Join Date
    Dec 2007
    Location
    @InterN0T
    Posts
    315

    Default

    Quote Originally Posted by Virchanza View Post
    @MaXe Legend

    You can get anonymity from a plain ol' router by using NAT (i.e. Network Address Translation). You'll find that NAT is being used on the vast majority of networks nowadays that have an internet connection.
    Anonymity is NOT equal to NAT. In fact almost every home router has NAT
    built in. Does it add any extra anonymity? Beyond the fact that you can't be
    pinged directly, no: you can still receive traffic on connections you have
    opened, and you can still be infected with a reverse_tcp trojan.

    Proxies are either about A) Control or B) Anonymity. Reverse proxies are a whole different topic! (just to mention it)

    NAT stands for Network Address Translation (see LINK) and its primary
    function today is to save IPv4 addresses (because you can have a whole
    10.0.0.0/8 internal network translated to one NAT address instead of
    needing a whole public IP range).
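    That address-saving role can be sketched with a toy source-NAT table (all addresses are illustrative, and real NAT also tracks protocol, timeouts, and so on):

```python
# Toy sketch of source NAT: many internal 10.0.0.0/8 hosts share one
# public address, distinguished only by translated source ports.
import itertools

class ToyNat:
    def __init__(self, public_ip):
        self.public_ip = public_ip
        self._ports = itertools.count(40000)  # next free public port
        self.table = {}  # (internal_ip, internal_port) -> public port

    def translate(self, internal_ip, internal_port):
        """Map an internal endpoint to a (public_ip, public_port) pair."""
        key = (internal_ip, internal_port)
        if key not in self.table:
            self.table[key] = next(self._ports)
        return self.public_ip, self.table[key]

nat = ToyNat("198.51.100.1")
print(nat.translate("10.0.0.5", 51000))   # ('198.51.100.1', 40000)
print(nat.translate("10.0.0.99", 51000))  # ('198.51.100.1', 40001)
print(nat.translate("10.0.0.5", 51000))   # reuses the same mapping
```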

    That's just my opinion, but in essence: NAT is NOT anonymity.
    [quote][I]I realized, that I had fallen down from the top of the mountain into a deep, terrifying and dark hole, just to find out that another mountain in front of me, much greater than the previous, was the next step in life. I began to wander uphill on the next mountain of life while I knew it would be much harder than the previous mountain. [/I]- MaXe[/quote]

  4. #14
    Super Moderator lupin's Avatar
    Join Date
    Jan 2010
    Posts
    2,943

    Default

    Quote Originally Posted by MaXe Legend View Post
    Anonymity is NOT equal to NAT. In fact almost every home router has NAT
    built in. Does it add any extra anonymity? Beyond the fact that you can't be
    pinged directly, no: you can still receive traffic on connections you have
    opened, and you can still be infected with a reverse_tcp trojan.
    Well it depends on whether the public address that the NAT is translating to can be tied to you or not. My office uses NAT, so in this case NAT would provide me some anonymity to people out on the Internet. They may know that someone from my office communicated with them, but they wouldn't be able to tie it directly to me. Without appropriate internal logging being performed, my organisation wouldn't be able to tie the communication to me either (although in this case, the appropriate logging is being performed). Even so, the communication can still be traced back to the "owner" of the network (my organisation), regardless of whether it can be tied to the actual system used, so there is still a party who can be held responsible for it.

    However in the case of my home ADSL account, any communication from my home network can be tied back to me via the one public IP address all of my machines use for NAT. So no anonymity for me there.
    Capitalisation is important. It's the difference between "Helping your brother Jack off a horse" and "Helping your brother jack off a horse".

    The Forum Rules, Forum FAQ and the BackTrack Wiki... learn them, love them, live them.

  5. #15
    Very good friend of the forum Virchanza's Avatar
    Join Date
    Jan 2010
    Posts
    863

    Default

    Quote Originally Posted by lupin View Post
    This would probably need to be integrated into the IP stack to work efficiently, which would mean kernel integration, and which would make it harder to keep the proxy portable across Operating Systems (even amongst Unix-like Operating Systems, like Linux, BSD, etc).

    No need to integrate into the kernel. Recently I wrote a program that dealt with manipulating raw Ethernet frames, and it wasn't much of a task at all to keep it portable.

    All Squid would have to do is listen out for IP packets whose IP address is different from its own IP address, and then discard or forward these packets depending on the filter settings.

    It's starting to get more into the application firewall space rather than the proxy space. There are some NetFilter extensions for Linux that are going in that direction, but they are not as functional as a proxy just yet.
    I think it's time the two merged, I'd love to see Squid working on ALL ports with ALL protocols.

    I grabbed the GET requests for a visit to www.microsoft.com from the HTTP headers for two different web requests, one using a proxy and the other not using a proxy (which are the type of requests a transparent proxy gets)

    No proxy (or transparent proxy):
    GET / HTTP/1.1

    Proxy:
    GET http://www.microsoft.com/ HTTP/1.1
    My own website is hosted on a webserver that hosts thousands of other people's websites. Therefore, if you type the webserver's IP address into your web browser, the webserver doesn't know which site you want, so it just comes up blank.

    I opened up Wireshark and tried to access my own website, virjacode.com. Without going through a non-transparent proxy server, an HTTP GET request was made for "/". So my first thought was, "how the hell does the webserver know to respond with Virjacode?!", but then I looked further into the HTTP request and saw a header saying "Host: virjacode.com". So there you have it, even if the GET request is for "/", the server still gets the hostname. I even tried it out with different port numbers, and sure enough the port number showed up in the Host header, e.g. "Host: virjacode.com:80".

    So in the end, the non-transparent proxy server doesn't actually get more information when it comes to HTTP.
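    A minimal sketch of the Host-header parsing described above (it ignores IPv6 address literals, which use brackets):

```python
# Sketch: how a server (or proxy) can recover the virtual host and
# port from a Host header value.
def parse_host_header(value, default_port=80):
    """Split a Host header value into (hostname, port)."""
    host, sep, port = value.partition(":")
    return host, int(port) if sep else default_port

print(parse_host_header("virjacode.com"))      # ('virjacode.com', 80)
print(parse_host_header("virjacode.com:8080")) # ('virjacode.com', 8080)
```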

    Making it work with the IP forwarding system is much more complicated, would be much harder to maintain portability for, has the potential to slow down all traffic, would require it to understand all possible higher level protocols, etc, etc. It would definitely be useful, but probably not that easy to actually implement.
    Really, I disagree entirely. It could run fast, and be easy to write (especially with all the cross-platform networking libraries available to us nowadays, e.g. the Berkeley Sockets API).


    You would probably find that NTP/SNTP wouldn't work through a non-transparent proxy anyway. Most usually only support some combination of HTTP/HTTPS/FTP/SOCKS/GOPHER (and GOPHER is rare because it's so old) and usually nothing else. To get anything else through it you would need to encapsulate it in one of those protocols (for example HTTP tunneling, socksify-ing the traffic, etc.).
    Another thing about the "perfect" transparent proxy server is that it would work with ANY higher-level protocol, and on ANY port, because all it's doing is discarding or forwarding IP packets unaltered.

    Regarding NAT not offering anonymity, well it depends how many people are behind it. If your ISP gives you a phone call because you're downloading pirate videos, you can say "Well it must have been my sister because it wasn't me!". If you're in a large internet cafe or computer room with dozens of computers, you can just say "I haven't a clue what you're talking about".
    Ask questions on the open forums, that way everybody benefits from the solution, and everybody can be corrected when they make mistakes. Don't send me private messages asking questions that should be asked on the open forums, I won't respond. I decline all "Friend Requests".

  6. #16
    Super Moderator lupin's Avatar
    Join Date
    Jan 2010
    Posts
    2,943

    Default

    Quote Originally Posted by Virchanza View Post
    No need to integrate into the kernel. Recently I wrote a program that dealt with manipulating raw Ethernet frames, and it wasn't much of a task at all to keep it portable.

    All Squid would have to do is listen out for IP packets whose IP address is different from its own IP address, and then discard or forward these packets depending on the filter settings.
    Don't forget that individual IP packets don't necessarily contain all of the higher-level information needed for filtering decisions. In a given request for a web site, the URL may be contained in only one or two packets. How do you go about filtering the rest of the packets in that communication? Don't forget that if you drop them, TCP's reliable delivery mechanisms will just retransmit them, because the URL is transferred after the three-way handshake to initiate a session is complete.

    Assuming it's not running as a socket application, to work effectively the proxy needs to cache traffic so it can reassemble enough of the data portion of the packets to perform processing on, perform pattern matching to determine what type of higher-level protocol is being used, extract relevant fields for processing (e.g. the URL), perform filtering on those fields, and terminate connections it wants to block (by sending TCP RSTs, maybe).

    It has to extend the way the IP stack works. It's different to manipulating raw Ethernet, because it has to make use of all of the logic that already exists in a system for dealing with packets.

    Maybe you can do it outside of the kernel, but the OS TCP/IP stack is put there for performance reasons, and if you attempt to extend/rewrite that there is likely to be a performance penalty.

    Quote Originally Posted by Virchanza View Post
    I think it's time the two merged, I'd love to see Squid working on ALL ports with ALL protocols.
    Check with the squid devs and see what they say.


    Quote Originally Posted by Virchanza View Post
    My own website is hosted on a webserver that contains thousands of other people's websites. Therefore, if you type the webserver's IP address into your webbrowser, the webserver doesn't know which site you want, so it just comes up blank.

    I opened up Wireshark and tried to access my own website, virjacode.com. Without going through a non-transparent proxy server, a HTTP GET request was made for "/". So my first thought was, "how the hell does the webserver know to respond with Virjacode?!", but then I looked further into the HTTP datagram and saw a section saying "Host: virjacode.com". So there you have it, even if the GET request is for "/", it still gets the URL. I even tried it out with different port numbers, and sure enough the port number became listed under "Host: virjacode.com:80".

    So in the end, the non-transparent proxy server doesn't actually get more information when it comes to HTTP.
    It's true that the Host header contains this information, but the Host header isn't mandatory for an HTTP 1.0 request, and the proxy server needs to be able to deal with requests from clients that don't use it. The header was added as an extension in HTTP 1.1 to deal with virtual hosts on a webserver.
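    The fallback order that this implies can be sketched as follows (the names are illustrative, and this is not how Squid actually structures its code): trust an absolute request URL first, then a Host header, and only then the packet's destination IP with an assumed port.

```python
# Sketch: how a proxy might decide the real destination of a request,
# covering non-transparent, HTTP/1.1 transparent, and HTTP/1.0
# transparent (no Host header) cases.
from urllib.parse import urlsplit

def pick_destination(request_url, host_header, packet_dst_ip):
    """Return (host, port) using the best information available."""
    if request_url.startswith("http://"):
        u = urlsplit(request_url)
        return u.hostname, u.port or 80            # non-transparent proxy
    if host_header:
        host, sep, port = host_header.partition(":")
        return host, int(port) if sep else 80       # HTTP/1.1 transparent
    return packet_dst_ip, 80                        # HTTP/1.0: assume port 80

print(pick_destination("/", None, "203.0.113.7"))  # -> ('203.0.113.7', 80)
```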

    Quote Originally Posted by Virchanza View Post
    Really, I disagree entirely. It could run fast, and be easy to write (especially with all the cross-platform networking libraries available to us nowadays, e.g. the Berkeley Sockets API).
    There's plenty of socket-style APIs, true. But this isn't a socket application you're talking about, it's an extension of the IP stack. Socket apps listen on one or more sockets, with a socket being defined as a bound IP address and a TCP or UDP port number, and when they do this, they usually claim exclusive access over that socket (IP/port combination). Sockets don't work as part of the routing system.

    You can consider sockets as layer 5 and up of the OSI model, where the lower layers hand a socket a nice stream of bits to work with, and the socket app doesn't have to worry about how it happened. To perform work on packets without using sockets, you have to write something that can take care of the details in layers 3/4 of the stack as well as performing the higher-level checking that you need to do. And as I mentioned, most current Operating Systems have this layer 3 and 4 code that you would need to replace/extend running as part of the kernel.

    Quote Originally Posted by Virchanza View Post
    Another thing about the "perfect" transparent proxy server is that it would work with ANY higher-level protocol, and on ANY port, because all it's doing is discarding or forwarding IP packets unaltered.
    If you want to make decisions based on data portions of the TCP segment, you have to understand that data. How do you expect the proxy to do something like filtering on URLs if it doesn't understand how to find a URL in the traffic it sees? Something that can filter on any higher level protocol and on any port is basically just a plain old firewall which has to base its filtering decisions on data in the TCP and IP headers.

    Quote Originally Posted by Virchanza View Post
    Regarding NAT not offering anonymity, well it depends how many people are behind it. If your ISP gives you a phone call because you're downloading pirate videos, you can say "Well it must have been my sister because it wasn't me!". If you're in a large internet cafe or computer room with dozens of computers, you can just say "I haven't a clue what you're talking about".
    Agreed, NAT provides anonymity only if the public address cannot be tied to you personally.
    Capitalisation is important. It's the difference between "Helping your brother Jack off a horse" and "Helping your brother jack off a horse".

    The Forum Rules, Forum FAQ and the BackTrack Wiki... learn them, love them, live them.

  7. #17
    Member
    Join Date
    Dec 2007
    Location
    @InterN0T
    Posts
    315

    Default

    I agree 100% with Lupin. However, at most internet cafés everything is logged as well,
    and proper companies usually have logging too. So it doesn't really give any anonymity,
    except that from an external perspective it can't be tied to you directly, and even then
    there is usually more than one way.

    If your ISP calls you and says you're downloading videos, they don't really care if you
    say "it was my sister or someone else", because it's STILL your connection and YOUR
    responsibility if you're the owner (or in this case perhaps your parents').

    But if there's no logging or similar, then it shouldn't be possible to tie it to one person
    exactly. It can still be tied to the external IP, though, which will hold someone responsible
    who shouldn't be: in this case either your home connection or your company (and companies
    wouldn't like cases like this; it would be taken quite seriously, as the company can be
    pursued by law).

    So in my opinion, NAT only provides anonymity IF the address can't be tied to you or
    any affiliates/colleagues/etc. that you might know directly. Proxies and VPNs are better solutions.

    So yes, I generally agree with Lupin.

    There are no stronger arguments for NAT offering anonymity than the above, and that anonymity is usually quite poor. NAT wasn't meant for this at all.
    [quote][I]I realized, that I had fallen down from the top of the mountain into a deep, terrifying and dark hole, just to find out that another mountain in front of me, much greater than the previous, was the next step in life. I began to wander uphill on the next mountain of life while I knew it would be much harder than the previous mountain. [/I]- MaXe[/quote]

  8. #18
    Good friend of the forums
    Join Date
    Jun 2008
    Posts
    425

    Default

    @ lupin
    Wouldn't a program like socat do that? It can take, say, all the TCP connections and send them out on one port, or some combination of the above.

    Unless you change the window size, you should only have to store about two packets to get the full picture, and just use the sequence/ACK numbers to work out the order; (GET /alongweburladdress) won't take up more than 8000 bytes in theory.
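    The ordering idea can be sketched like this (a real reassembler must also handle gaps, overlaps and retransmits):

```python
# Sketch: buffer a few TCP segments and use their sequence numbers
# to put the request back in order before inspecting it.
def reassemble(segments):
    """segments: list of (seq_number, payload_bytes), possibly out of order."""
    return b"".join(data for seq, data in sorted(segments))

parts = [(1461, b"webpage HTTP/1.1\r\n"), (1, b"GET /along")]
print(reassemble(parts))  # -> b'GET /alongwebpage HTTP/1.1\r\n'
```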

  9. #19
    Super Moderator lupin's Avatar
    Join Date
    Jan 2010
    Posts
    2,943

    Default

    Quote Originally Posted by compaq View Post
    @ lupin
    Wouldn't a program like socat do that? It can take, say, all the TCP connections and send them out on one port, or some combination of the above.
    From the socat readme:

    socat is a relay for bidirectional data transfer between two independent data channels. Each of these data channels may be a file, pipe, device (serial line etc. or a pseudo terminal), a socket (UNIX, IP4, IP6 - raw, UDP, TCP), an SSL socket, proxy CONNECT connection, a file descriptor (stdin etc.), the GNU line editor (readline), a program, or a combination of two of these. These modes include generation of "listening" sockets, named pipes, and pseudo terminals.
    So it basically uses well defined interfaces for which APIs exist, such as sockets, pipes, etc.

    Quote Originally Posted by compaq View Post
    Unless you change the window size, you should only have to store about two packets to get the full picture, and just use the sequence/ACK numbers to work out the order; (GET /alongweburladdress) won't take up more than 8000 bytes in theory.
    Depends on the protocol used. For HTTP you would normally see most of the HTTP header in the first few packets, sure. That's assuming you want to filter on the URL and not another header (Referer, Mime-Type, Cookie, etc.), in which case you may need to capture slightly more, or on the page contents, in which case you need to capture much more of the communication. And don't forget you need to capture enough of each and every connection to classify what higher-level protocol is used, or at least to rule out further filtering of higher-level protocols you are not interested in.

    Really, this page below says it better than I could. It concerns layer 7 filtering in Netfilter (packet filtering for the Linux kernel), which is basically filtering on packet contents beyond the IP/TCP header. This has been implemented as a kernel patch, not a running service. Notice that the higher level applications supported are specifically listed.

    In particular, read the section entitled "Blocking"

    http://l7-filter.sourceforge.net/HOWTO
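    A much-simplified sketch of that layer-7 classification idea, done in userspace rather than as a kernel patch (the patterns below are my own illustrations, not l7-filter's actual regexes):

```python
# Sketch: classify a connection by pattern-matching the first chunk
# of application data, in the spirit of l7-filter.
import re

PATTERNS = {
    "http": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE|OPTIONS) \S+ HTTP/1\.[01]"),
    "ssh":  re.compile(rb"^SSH-\d\.\d"),
}

def classify(first_bytes):
    """Return the first protocol whose pattern matches, else 'unknown'."""
    for proto, pattern in PATTERNS.items():
        if pattern.match(first_bytes):
            return proto
    return "unknown"

print(classify(b"GET / HTTP/1.1\r\nHost: example.com\r\n"))  # http
print(classify(b"SSH-2.0-OpenSSH_5.3\r\n"))                  # ssh
```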
    Capitalisation is important. It's the difference between "Helping your brother Jack off a horse" and "Helping your brother jack off a horse".

    The Forum Rules, Forum FAQ and the BackTrack Wiki... learn them, love them, live them.

  10. #20
    Good friend of the forums
    Join Date
    Jun 2008
    Posts
    425

    Default

    @ lupin
    It would come in handy, but the blocking produces a lot of false positives (I guess it's hard to work out all the protocols).

    Cheers

