Category Archives: Security

JNCIP-SEC – Configuring Juniper Enhanced Web Filtering

Two more weeks left to take the JNCIP-SEC exam, or I will lose my JNCIS-SEC credentials. I have a fully loaded SRX320 cluster (with advanced feature licenses) and a vSRX in place so I need to lab up quickly. First feature on the list is Juniper Enhanced Web filtering.

Before you begin the configuration, make sure you meet the following prerequisites:

  • The device has a valid wf_key_websense_ewf license
  • The egress interface has internet connectivity
  • The device has working name-servers defined
  • If you are enabling the feature on a virtual router, be sure to define the VR under [edit security utm feature-profile web-filtering]

To activate the Juniper Enhanced Web Filtering feature, enter the following command:

[edit security utm feature-profile web-filtering]
root@NP-vSRX# show
type juniper-enhanced;

Next, enable the UTM feature profile for Juniper Enhanced Web Filtering and configure the Juniper-managed Websense server URL.

[edit security utm feature-profile web-filtering]
root@NP-vSRX# show
juniper-enhanced {
    server {
        host rp.cloud.threatseeker.com;
        port 80;
    }
}

Before we configure the Websense categories, we should define custom whitelists and blacklists, which can be used to overrule unexpected results from the automated Websense categorization.

Under the UTM custom-objects, we’ll create an URL pattern for one of our favorite networking-related forums and for a blacklisted adult site. These are then addded to custom black/whitelist categories.

[edit security utm custom-objects]
root@NP-vSRX# show
url-pattern {
    URL-networking-forumsdotcom {
        value [ http://networking-forums.com 64.90.58.116 ];
    }
    URL-sexdotcom {
        value [ sex.com 206.125.164.82 ];
    }
}
custom-url-category {
    EWF-Blacklist {
        value URL-sexdotcom;
    }
    EWF-Whitelist {
        value URL-networking-forumsdotcom;
    }
}

Our two custom URL categories are added under the global web-filtering configuration.

[edit security utm feature-profile web-filtering]
root@NP-vSRX# show
url-whitelist EWF-Whitelist;
url-blacklist EWF-Blacklist;

As for the Enhanced Web-filtering example, I’ll create one profile for standard internet access called “WF-DefaultInternetAccess”. Adult content and malicious URLs/IP addresses will be blocked by default and some ‘undesirable’ categories will be allowed, yet a security event will be logged. The default action will be set to “permit”.

[edit security utm feature-profile web-filtering juniper-enhanced]
root@NP-vSRX# show
cache;
server {
    host rp.cloud.threatseeker.com;
    port 80;
}
profile WF-DefaultInternetAccess {
    category {
        Enhanced_Adult_Content {
            action block;
        }
        Enhanced_Adult_Material {
            action block;
        }
        Enhanced_Advanced_Malware_Command_and_Control {
            action block;
        }
        Enhanced_Advanced_Malware_Payloads {
            action block;
        }
        Enhanced_Bot_Networks {
            action block;
        }
        Enhanced_Dynamic_DNS {
            action block;
        }
        Enhanced_Emerging_Exploits {
            action block;
        }
        Enhanced_Keyloggers {
            action block;
        }
        Enhanced_Malicious_Embedded_Link {
            action block;
        }
        Enhanced_Malicious_Embedded_iFrame {
            action block;
        }
        Enhanced_Malicious_Web_Sites {
            action block;
        }
        Enhanced_Mobile_Malware {
            action block;
        }
        Enhanced_Parked_Domain {
            action block;
        }
        Enhanced_Phishing_and_Other_Frauds {
            action block;
        }
        Enhanced_Proxy_Avoidance {
            action block;
        }
        Enhanced_Spyware {
            action block;
        }
        Enhanced_Abused_Drugs {
            action log-and-permit;
        }
        Enhanced_Alcohol_and_Tobacco {
            action log-and-permit;
        }
        Enhanced_Drugs {
            action log-and-permit;
        }
        Enhanced_Gambling {
            action log-and-permit;
        }
        Enhanced_Games {
            action log-and-permit;
        }
        Enhanced_Hacking {
            action log-and-permit;
        }
        Enhanced_Illegal_or_Questionable {
            action log-and-permit;
        }
        Enhanced_Nudity {
            action log-and-permit;
        }
        Enhanced_Racism_and_Hate {
            action log-and-permit;
        }
        Enhanced_Sex {
            action log-and-permit;
        }
    }
    default permit;
    fallback-settings {
        default log-and-permit;
        server-connectivity block;
        timeout block;
        too-many-requests block;
    }
}

In case a URL does not have a category assigned but only a Web Reputation Score (WBRS) we can define an action based on that score.

Note – the scores are weighted as follows:

  • 100-90–Site is considered very safe.
  • 80-89–Site is considered moderately safe.
  • 70-79–Site is considered fairly safe.
  • 60-69–Site is considered suspicious.
  • 0-59–Site is considered harmful.
[edit security utm feature-profile web-filtering juniper-enhanced profile WF-DefaultInternetAccess site-reputation-action]
root@NP-vSRX# show
very-safe permit;
moderately-safe permit;
fairly-safe log-and-permit;
suspicious quarantine;
harmful block;

For some special cases, in which the device might not be able to process the HTTP request, we can define fallback options. This basically equates to a fail-open/fail-closed decision.

[edit security utm feature-profile web-filtering juniper-enhanced profile WF-DefaultInternetAccess fallback-settings]
root@NP-vSRX# show
default log-and-permit;
server-connectivity block;
timeout block;
too-many-requests block;

With most pieces in place, the full web-filtering profile looks like this:

[edit security utm feature-profile]
root@NP-vSRX# show
web-filtering {
    url-whitelist EWF-Whitelist;
    url-blacklist EWF-Blacklist;
    type juniper-enhanced;
    juniper-enhanced {
        cache;
        server {
            host rp.cloud.threatseeker.com;
            port 80;
        }
        profile WF-DefaultInternetAccess {
            category {
                Enhanced_Adult_Content {
                    action block;
                }
                Enhanced_Adult_Material {
                    action block;
                }
                Enhanced_Advanced_Malware_Command_and_Control {
                    action block;
                }
                Enhanced_Advanced_Malware_Payloads {
                    action block;
                }
                Enhanced_Bot_Networks {
                    action block;
                }
                Enhanced_Dynamic_DNS {
                    action block;
                }
                Enhanced_Emerging_Exploits {
                    action block;
                }
                Enhanced_Keyloggers {
                    action block;
                }
                Enhanced_Malicious_Embedded_Link {
                    action block;
                }
                Enhanced_Malicious_Embedded_iFrame {
                    action block;
                }
                Enhanced_Malicious_Web_Sites {
                    action block;
                }
                Enhanced_Mobile_Malware {
                    action block;
                }
                Enhanced_Parked_Domain {
                    action block;
                }
                Enhanced_Phishing_and_Other_Frauds {
                    action block;
                }
                Enhanced_Proxy_Avoidance {
                    action block;
                }
                Enhanced_Spyware {
                    action block;
                }
                Enhanced_Abused_Drugs {
                    action log-and-permit;
                }
                Enhanced_Alcohol_and_Tobacco {
                    action log-and-permit;
                }
                Enhanced_Drugs {
                    action log-and-permit;
                }
                Enhanced_Gambling {
                    action log-and-permit;
                }
                Enhanced_Games {
                    action log-and-permit;
                }
                Enhanced_Hacking {
                    action log-and-permit;
                }
                Enhanced_Illegal_or_Questionable {
                    action log-and-permit;
                }
                Enhanced_Nudity {
                    action log-and-permit;
                }
                Enhanced_Racism_and_Hate {
                    action log-and-permit;
                }
                Enhanced_Sex {
                    action log-and-permit;
                }
            }
            site-reputation-action {
                very-safe permit;
                moderately-safe permit;
                fairly-safe log-and-permit;
                suspicious quarantine;
                harmful block;
            }
            default permit;
            fallback-settings {
                default log-and-permit;
                server-connectivity block;
                timeout block;
                too-many-requests block;
            }
        }
    }
}

Next, we need configure a new UTM policy which includes the Enhanced Web Filtering feature. This can be combined with other UTM features such as Sophos anti-virus or content filtering.

[edit security utm]
root@NP-vSRX# set utm-policy UTM-DefaultInternetAccess web-filtering http-profile WF-DefaultInternetAccess

The last step is to attach the UTM profile to a security policy by configuring the application-services action.

[edit security policies from-zone internal-np to-zone internal-transit policy FW-Internal-PermitAll]
root@NP-vSRX# show
match {
    source-address any;
    destination-address any;
    application any;
}
then {
    permit {
        application-services {
            utm-policy UTM-DefaultInternetAccess;
        }
    }
}

Configuring logging

To make webfilter troubleshooting easier, configure the following syslog files, or alternatively send them to a syslog server.

[edit system syslog]
root@NP-vSRX# show
file blocked-sites {
    any any;
    match WEBFILTER_URL_BLOCKED;
}
file allowed-sites {
    any any;
    match WEBFILTER_URL_PERMITTED;
}

Testing

Testing the custom blacklist by browsing to sex.com – HTTPS session was broken (based on destination IP) and the browser returned an error.

Oct 13 14:21:14  NP-vSRX RT_UTM: WEBFILTER_URL_BLOCKED: WebFilter: ACTION="URL Blocked" 10.255.1.13(50278)->206.125.164.82(443) CATEGORY="EWF-Blacklist" REASON="BY_BLACK_LIST" PROFILE="WF-DefaultInternetAccess" URL=206.125.164.82 OBJ=/ username N/A roles N/A

Next, we test the URL in the whitelist (which was actually be blocked by Websense). Networking-forums loads fine now and we have this in the log:

Oct 13 14:23:43  NP-vSRX RT_UTM: WEBFILTER_URL_PERMITTED: WebFilter: ACTION="URL Permitted" 10.255.1.13(50303)->64.90.58.116(443) CATEGORY="EWF-Whitelist" REASON="BY_WHITE_LIST" PROFILE="WF-DefaultInternetAccess" URL=64.90.58.116 OBJ=/ username N/A roles N/A

As a last test, we initiate an HTTP session to http://www.penthouse.com
In the browser, we get the following message:

Juniper Networks Firewall has blocked the URL: 10.255.1.13(62175)->64.59.126.214(80) www.penthouse.com/ CATEGORY: Enhanced_Adult_Content REASON: BY_PRE_DEFINED

And in the syslog file, it looks like this:

Oct 13 14:33:29  NP-vSRX RT_UTM: WEBFILTER_URL_BLOCKED: WebFilter: ACTION="URL Blocked" 10.255.1.13(62175)->64.59.126.214(80) CATEGORY="Enhanced_Adult_Content" REASON="BY_PRE_DEFINED" PROFILE="WF-DefaultInternetAccess" URL=www.penthouse.com OBJ=/ username N/A roles N/A

Verification Commands

To check connectivity between your SRX and the Juniper Websense server:

root@NP-vSRX> show security utm web-filtering status
 UTM web-filtering status:
    Server status: Juniper Enhanced using Websense server UP

Web-filtering statistics – this is useful when the service is not working as expected.

root@NP-vSRX> show security utm web-filtering statistics
 UTM web-filtering statistics:
    Total requests:                     17086
    white list hit:                     20
    Black list hit:                     20
    No license permit:                  0
    Queries to server:                  6472
    Server reply permit:                1868
    Server reply block:                 9
    Server reply quarantine:            0
    Server reply quarantine block:      0
    Server reply quarantine permit:     0
    Custom category permit:             0
    Custom category block:              0
    Custom category quarantine:         0
    Custom category qurantine block:    0
    Custom category quarantine permit:  0
    Site reputation permit:             10597
    Site reputation block:              2495
    Site reputation quarantine:         49
    Site reputation quarantine block:   0
    Site reputation quarantine permit:  1170
    Site reputation by Category         0
    Site reputation by Global           14311
    Cache hit permit:                   847
    Cache hit block:                    11
    Cache hit quarantine:               0
    Cache hit quarantine block:         0
    Cache hit quarantine permit:        0
    Safe-search redirect:               0
    Web-filtering sessions in total:    128000
    Web-filtering sessions in use:      2
    Fallback:                       log-and-permit           block
          Default                                 0               0
          Timeout                                 0               0
     Connectivity                                 0               0
Too-many-requests                                 0               0

That’s pretty much all there is to it for a basic EWF setup. There are many more nerdknobs you can tune but I’ll stick to the basics for my JNCIP-SEC studies.

Bear in mind that the URL filter only works for HTTP traffic and most of the web these day is on SSL and using CDN networks, so it’s very hard to get a good match on SSL traffic. This can be solved by using SSL Forward Proxy or relying more on Layer-7 AppID with custom application signatures.

Sources:

Juniper Enhanced Web Filtering feature guide.

IPv6 on JunOS – Static Routing

In my previous post I configured IPv6 with Prefix Delegation. IPv6-PD is alright for a simple topology but it is very, very limiting and seems quite complicated to implement when you have many subnets in a complex topology.

This post will cover static IPv6 routing. Ideally I would have implemented OSPFv3 to handle dynamic routing but sadly, my switches do not support that feature, even with the EFL license.

Lab Topology

Here is a small part of my network which I’ll be using to test it.

IPv6 Topology

Notice that the client VLAN is the only segment with a globally routed IPv6 address. With IPv6, it is perfectly fine to use only link-local (non-routed) addressing on your transit networks, and in fact it has several benefits! If you’re curious what those benefits are, you can read all about it in RFC7407 by Eric Vyncke and Michael Behringer.

Addressing configuration

My provider has assigned me two static prefixes to me. These are not the real ones btw 🙂

IPv6 LAN prefix 	2a02:1234:420a::/48
IPv6 WAN prefix 	2a02:1234:8401:9a00::/64

First, let’s configure our WAN interface with a static IP address in the WAN range.

[edit]
admin@NPFW01# set interfaces pp0 unit 0 family inet6 address 2a02:1234:8401:9a00::1/64

As I’m using PPPoE, I will need to add a static default route (::/0) with the pp0.0 interface as next-hop. Also, I’m using VRFs so I need to add it under the right routing instance.

[edit routing-instances VRF-Edge]
admin@NPFW01# set routing-options rib VRF-Edge.inet6.0 static route ::/0 next-hop pp0.0

A quick ping to Google to verify that we have IPv6 connectivity on the external interface.

admin@NPFW01> ping inet6 2001:4860:4860::8888 routing-instance VRF-Edge source 2a02:1234:8401:9a00::1 rapid count 10
PING6(56=40+8+8 bytes) 2a02:1234:8401:9a00::1 --> 2001:4860:4860::8888
!!!!!!!!!!
--- 2001:4860:4860::8888 ping6 statistics ---
10 packets transmitted, 10 packets received, 0% packet loss
round-trip min/avg/max/std-dev = 30.980/31.647/32.716/0.444 ms

So far, so good. Let’s configure the internal interfaces now.

Configuration for the link-local addresses on the Firewall:

[edit interfaces ae0 unit 90 family inet6]
admin@NPFW01# show
address fe80:90::2/64;
[edit interfaces ae1 unit 10 family inet6]
admin@NPFW01# show
address fe80:10::2/64;
[edit interfaces ae1 unit 100 family inet6]
admin@NPFW01# show
address fe80:100::2/64;

And for the switch:

{master:0}[edit interfaces vlan unit 10 family inet6]
admin@NPSWC01# show
address fe80:10::1/64;
{master:0}[edit interfaces vlan unit 90 family inet6]
admin@NPSWC01# show
address fe80:90::1/64;
{master:0}[edit interfaces vlan unit 100 family inet6]
admin@NPSWC01# show
address fe80:100::1/64;
{master:0}[edit interfaces vlan unit 101 family inet6]
admin@NPSWC01# show
address fe80:101::1/64;

In my case, the interfaces are already in the right routing instances so I don’t need to add them anymore. Don’t forget this step though, if you are recreating this in your own lab.

After a commit, we can confirm reachability between the devices inside their routing instances.

admin@NPSWC01> ping fe80:90::2 routing-instance VRF-Transit source fe80:90::1 rapid count 5
PING6(56=40+8+8 bytes) fe80:90::1 --> fe80:90::2
!!!!!
--- fe80:90::2 ping6 statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/std-dev = 3.981/4.689/6.566/0.947 ms
admin@NPSWC01> ping fe80:10::2 routing-instance VRF-Transit source fe80:10::1 rapid count 5
PING6(56=40+8+8 bytes) fe80:10::1 --> fe80:10::2
!!!!!
--- fe80:10::2 ping6 statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/std-dev = 4.181/5.018/6.296/0.731 ms
admin@NPSWC01> ping fe80:100::2 routing-instance VRF-NP source fe80:100::1 rapid count 5
PING6(56=40+8+8 bytes) fe80:100::1 --> fe80:100::2
!!!!!
--- fe80:100::2 ping6 statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/std-dev = 3.924/4.819/5.715/0.575 ms

Both devices should now have an entry for eachother in the IPv6 neighbour tables, which is the equivalent of the v4 ARP table.

admin@NPSWC01> show ipv6 neighbors
IPv6 Address                 Linklayer Address  State       Exp Rtr Secure Interface
fe80:10::2                   a8:d0:e5:d3:2a:81  stale       927 yes no      vlan.10
fe80:90::2                   a8:d0:e5:d3:2a:80  stale       896 yes no      vlan.90
fe80:100::2                  a8:d0:e5:d3:2a:81  stale       887 yes no      vlan.100

Client network addressing

Last step is to assign the VLAN201 interface with a publically routed interface. I will be configuring this address: 2a02:1234:420a:10c9::1/64

There are probably more elegant addressing designs but this is what I came up with:

  • The first 48 bits are our globally routed prefix.
  • For the next 16 bits, I’ve used the leading 4 bits to represent the site number. This is my first site, so number 1.
  • The last 12 bits represent the VLAN number in hexadecimal notation, as there are 4096 vlans available in the dot1q standard – VLAN201 is C9 in hex!

Granted, it’s not the most scalable solution, as it only gives me 15 more sites and it wastes a lot of addresses but hey, this is not a multinational yet! 🙂

Here’s the configuration for the VLAN201 SVI:

{master:0}[edit interfaces vlan unit 201 family inet6]
admin@NPSWC01# show
address 2a02:1234:420a:10c9::1/64;

Enabling Router Advertisements

To enable dynamic address assignments (SLAAC) for the clients behind it, we also need to enable router advertisements on this interface. The other-stateful-configuration command will add the O-Flag to the advertisements, so we can later add the additional DNS information via DHCPv6.

{master:0}[edit protocols router-advertisement interface vlan.201]
admin@NPSWC01# show
other-stateful-configuration;
prefix 2a02:1234:420a:10c9::/64;
}

After committing, the client has assigned itself an IPv6 based on the received router advertisements.

Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . : ***-networks.local
   IPv6 Address. . . . . . . . . . . : 2a02:1234:420a:10c9:b1d1:660c:ac69:c5f8
   Temporary IPv6 Address. . . . . . : 2a02:1234:420a:10c9:3c71:b6f2:e93f:67ab
   Link-local IPv6 Address . . . . . : fe80::b1d1:660c:ac69:c5f8%6
   IPv4 Address. . . . . . . . . . . : 10.255.1.13
   Subnet Mask . . . . . . . . . . . : 255.255.255.128
   Default Gateway . . . . . . . . . : fe80::5e45:27ff:fee7:af81%6
                                       10.255.1.1

The basics are now in place. Now we will add some routing to make this prefix reachable.

Configuring static routes

As mentioned, my EX2200 switches do not support OSPFv3 so static routing is the only option for now…

Note – If you are using link-local addresses as the next-hop, you must use the qualified-next-hop statement with the interface in question.

Static routes, using link-local addresses, added on the switch’s routing-instances:

{master:0}[edit routing-instances VRF-Transit routing-options]
admin@NPSWC01# show
rib VRF-Transit.inet6.0 {
    static {
        route ::/0 {
            qualified-next-hop fe80:90::2 {
                interface vlan.90;
            }
        }
        route 2a02:1234:420a:10c9::/64 {
            qualified-next-hop fe80:10::2 {
                interface vlan.10;
            }
        }
    }
}
{master:0}[edit routing-instances VRF-NP routing-options]
admin@NPSWC01# show
rib VRF-NP.inet6.0 {
    static {
        route ::/0 {
            qualified-next-hop fe80:100::2 {
                interface vlan.100;
            }
        }
    }
}

And the routes on the firewall:

[edit routing-instances VRF-Internal routing-options]
admin@NPFW01# show
rib VRF-Internal.inet6.0 {
    static {
        route ::/0 {
            qualified-next-hop fe80:10::1 {
                interface ae1.10;
            }
        }
        route 2a02:1234:420a:10c9::/64 {
            qualified-next-hop fe80:100::1 {
                interface ae1.100;
            }
        }
    }
}
admin@NPFW01# show
rib VRF-Edge.inet6.0 {
    static {
        route ::/0 next-hop pp0.0;
        route 2a02:1234:420a:10c9::/64 {
            qualified-next-hop fe80:90::1 {
                interface ae0.90;
            }
        }
    }
}

After adding this configuration, and assuming that the necessary firewall policies are in place, the client can now ping outbound to the internet.

C:\Users\user>ping 2001:4860:4860::8888

Pinging 2001:4860:4860::8888 with 32 bytes of data:
Reply from 2001:4860:4860::8888: time=30ms
Reply from 2001:4860:4860::8888: time=30ms
Reply from 2001:4860:4860::8888: time=31ms
Reply from 2001:4860:4860::8888: time=30ms

Ping statistics for 2001:4860:4860::8888:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 30ms, Maximum = 31ms, Average = 30ms

That’s all it takes to implement static routing for IPv6, easy-peasy!

IPv6 on Juniper SRX – Prefix Delegation & DHCPv6

I’m currently setting up IPv6 in my own office so I thought it would be worth documenting the configuration I’m adding on my Juniper equipment to make it all work.

I have an internet subscription with a static IPv4 address and also included with it is a /48 static IPv6 prefix. There are two ways I can set it up: static configuration or “semi-automatic” by using Prefix Delegation. This article discusses the latter and in my next post I will document the static configuration, as that’s what I will ultimately be using.

First Things First – Enabling IPv6 Forwarding

Before you can start using IPv6 on the SRX, the device needs to be explicitly configured to forward V6 traffic. This because, by default, the forwarding mode for IPv6 is set to drop. You can verify this on the command line – as shown below, the default forwarding mode for Inet6 traffic is drop:

np@NP-FW01> show security flow status
  Flow forwarding mode:
    Inet forwarding mode: flow based
    Inet6 forwarding mode: drop
    MPLS forwarding mode: drop
    ISO forwarding mode: drop
  Flow trace status
    Flow tracing status: off
  Flow session distribution
    Distribution mode: RR-based
  Flow ipsec performance acceleration: off
  Flow packet ordering
    Ordering mode: Hardware

To activate it, you must change the forwarding mode. There are three modes to choose from: drop (default), flow-based and packet-based.

[edit]
np@NP-FW01# set security forwarding-options family inet6 mode ?
Possible completions:
  drop                 Disable forwarding
  flow-based           Enable flow-based forwarding
  packet-based         Enable packet-based forwarding

I’m using the SRX as a stateful firewall, so I’ll configure flow-based and commit. JunOS will warn you that the change is active only after a device reboot, so I’ll reboot the SRX and continue below.

[edit]
np@NP-FW01# set security forwarding-options family inet6 mode flow-based
[edit]
np@NP-FW01# show | compare
[edit security]
+   forwarding-options {
+       family {
+           inet6 {
+               mode flow-based;
+           }
+       }
+   }
[edit]
np@NP-FW01# commit
warning: You have enabled/disabled inet6 flow.
You must reboot the system for your change to take effect.
If you have deployed a cluster, be sure to reboot all nodes.
commit complete

After the reboot, the forwarding mode is set to flow based and we can start implementing IPv6.

np@NP-FW01> show security flow status
  Flow forwarding mode:
    Inet forwarding mode: flow based
    Inet6 forwarding mode: flow based
    MPLS forwarding mode: drop
    ISO forwarding mode: drop
  Flow trace status
    Flow tracing status: off
  Flow session distribution
    Distribution mode: RR-based
  Flow ipsec performance acceleration: off
  Flow packet ordering
    Ordering mode: Hardware

Configuring IPv6 with Prefix Delegation

Prefix delegation is a simple method to automate the assignment of IPv6 prefixes on subscriber equipment. The CPE (in my case the SRX) informs the upstream router, called the Broadband Network Gateway (BNG) that it wants to use Prefix Delegation. The BNG retrieves the prefix for the subscriber and announces the /48 back to the CPE, after which both sides confirm the prefix allocation. The CPE will then automatically divide the /48 into smaller /64 subnets on the LAN interfaces it has configured for PD Router Advertisements.

If you want to find out exactly how it works, you can find more information in RFC3633

First, we configure the outside interface, pp0.0 in my case. Notice that we are configuring this under family inet6.

The DHCPv6 client is activated on the outside, ISP-facing interface. Some PD-specific configuration is added and we will request the DNS server addresses from the provider.

[edit interfaces pp0 unit 0 family inet6]
np@NP-FW01# show
dhcpv6-client {
    client-type statefull;
    client-ia-type ia-pd;
    rapid-commit;
    client-identifier duid-type duid-ll;
    req-option dns-server;
    retransmission-attempt 0;
}

The LAN interface, in my case ae0.311, on which I will further delegate the prefix, is specified. Showing this separately as it’s important not to miss this one! 🙂

[edit interfaces pp0 unit 0 family inet6]
np@NP-FW01# set dhcpv6-client update-router-advertisement interface ae0.311

The last step is to allow DHCPv6 packets on the security zone inbound traffic. Could be that you need to add it under the interface, if you specified it further down on that level.

[edit security zones security-zone edge-untrust]
np@NP-FW01# set host-inbound-traffic system-services dhcpv6

If you want to use this on the LAN segment, you will also need to enable DHCPv6 on that security zone and interface. I’m using a guest VLAN for testing so I’ll enable it there.

[edit]
np@NP-FW01# set security zones security-zone edge-guest host-inbound-traffic system-services dhcpv6

After a commit, we see that interface ae0.311 now has a public IPv6 address with a /64 mask.

np@NP-FW01> show interfaces terse
Interface               Admin Link Proto    Local                 Remote

 ae0.311                 up    up   inet     192.168.200.1/24
                                   inet6    2a02:1234:420a:1::1/64
                                            fe80::aad0:e501:37d3:2a80/64
...

Notice that the SRX has taken the original /48 prefix and automatically derived a /64 subnet from it for interface ae0.311: 2a02:1234:420a:1::1/64 .

Note – real prefix modified for obvious reasons 🙂

Before we can test reachability, we will need to configure the default route via pp0.0. I’m using routing-instances (VRFs) so I will add it on my external edge instance.

[edit routing-instances VRF-Edge routing-options]
np@NP-FW01# show
rib VRF-Edge.inet6.0 {
    static {
        route ::/0 next-hop pp0.0;
    }
}

With the default route in place, we should be able to ping a public v6 address from this VRF. Below, we are pinging Google DNS, from the interface with Prefix Delegation.

np@NP-FW01> ping 2001:4860:4860::8888 routing-instance VRF-Edge rapid count 10 source 2a02:1234:420a:1::1
PING6(56=40+8+8 bytes) 2a02:1234:420a:1::1 --> 2001:4860:4860::8888
!!!!!!!!!!
--- 2001:4860:4860::8888 ping6 statistics ---
10 packets transmitted, 10 packets received, 0% packet loss
round-trip min/avg/max/std-dev = 31.811/32.841/35.861/1.163 ms

Success! This shows that IPv6-PD is working as expected and that the delegated prefix is globally reachable.

After connecting a test machine inside the VLAN, we can confirm that the router advertisements are working as expected. The client has taken the /64 prefix and assigned itself an EUI-64 address.

Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . : netprobe.guest
   Description . . . . . . . . . . . : Intel(R) Ethernet Connection (4) I219-V
   Physical Address. . . . . . . . . : C8-5B-76-BC-04-6E
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   IPv6 Address. . . . . . . . . . . : 2a02:1234:420a:1:103f:efd8:f2f7:d0cf(Preferred)
   Temporary IPv6 Address. . . . . . : 2a02:1234:420a:1:293c:fd60:c216:8123(Preferred)
   Link-local IPv6 Address . . . . . : fe80::103f:efd8:f2f7:d0cf%14(Preferred)
   IPv4 Address. . . . . . . . . . . : 192.168.200.13(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Lease Obtained. . . . . . . . . . : zondag 12 november 2017 15:23:41
   Lease Expires . . . . . . . . . . : maandag 13 november 2017 15:26:16
   Default Gateway . . . . . . . . . : fe80::aad0:e501:37d3:2a80%14
                                       192.168.200.1
   DHCP Server . . . . . . . . . . . : 192.168.200.1
   DHCPv6 IAID . . . . . . . . . . . : 63462262
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-20-35-58-1B-C8-5B-76-BC-04-6E
   DNS Servers . . . . . . . . . . . : 208.67.220.123
                                       208.67.222.123
   NetBIOS over Tcpip. . . . . . . . : Enabled

Now, this part seems to be working fine, but you may notice that the client has not received any IPv6 name servers.
This is because the client uses SLAAC to configure its v6 address based on the Router Advertisements, but it still relies on DHCPv6 to receive any special information such as DNS servers.

To make this work, we need to add a few more bits of configuration to use the SRX as a a DHCPv6 server.

Handing out DNS servers via DHCPv6

First, configure the DHCPv6 pool with the prefix and the DNS server you want to hand out. I’m using Google DNS here.

np@NP-FW01# show routing-instances VRF-Edge access address-assignment
pool vlan311-dhcpv6 {
    family inet6 {
        prefix 2a02:1234:420a:1::/64;
        dhcp-attributes {
            dns-server {
                2001:4860:4860::8888;
                2001:4860:4860::8844;
            }
        }
    }
}

Next, configure the internal DHCPv6 server under system services, referring to the pool you just created – add it under routing-instances if required!

np@NP-FW01# show routing-instances VRF-Edge system services dhcp-local-server
dhcpv6 {
    overrides {
        interface-client-limit 100;
        process-inform {
            pool vlan311-dhcpv6;
        }
    }
    group ipv6 {
        interface ae0.311;
    }
}
group group-vlan311 {
    interface ae0.311;
}

Activate Router Advertisements for your outside interface. It doesn’t seem to work for me without this.

[edit]
np@NP-FW01# show protocols router-advertisement
interface pp0.0;

The following bit controls the Router Advertisements on the delegated interface and activates the O-Flag (Other Information) on the RAs.

[edit interfaces pp0 unit 0 family inet6 dhcpv6-client update-router-advertisement interface ae0.311]
+         other-stateful-configuration;
+         max-advertisement-interval 6;
+         min-advertisement-interval 3;

After adding this configuration, we see the client has now successfully picked up the new DNS settings.

Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . : netprobe.guest
   Description . . . . . . . . . . . : Intel(R) Ethernet Connection (4) I219-V
   Physical Address. . . . . . . . . : C8-5B-76-BC-04-6E
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   IPv6 Address. . . . . . . . . . . : 2a02:1234:420a:1:103f:efd8:f2f7:d0cf(Preferred)
   Temporary IPv6 Address. . . . . . : 2a02:1234:420a:1:488c:f549:1e67:254c(Preferred)
   Link-local IPv6 Address . . . . . : fe80::103f:efd8:f2f7:d0cf%14(Preferred)
   IPv4 Address. . . . . . . . . . . : 192.168.200.13(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Lease Obtained. . . . . . . . . . : zondag 12 november 2017 15:23:41
   Lease Expires . . . . . . . . . . : maandag 13 november 2017 17:08:57
   Default Gateway . . . . . . . . . : fe80::aad0:e501:37d3:2a80%14
                                       192.168.200.1
   DHCP Server . . . . . . . . . . . : 192.168.200.1
   DHCPv6 IAID . . . . . . . . . . . : 63462262
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-20-35-58-1B-C8-5B-76-BC-04-6E
   DNS Servers . . . . . . . . . . . : 2001:4860:4860::8888
                                       2001:4860:4860::8844
                                       208.67.220.123
                                       208.67.222.123
   NetBIOS over Tcpip. . . . . . . . : Enabled

After adding a security policy that allows IPv6 traffic, the client can succesfully communicate over IPv6.

C:\Users\Simon>nslookup - 2001:4860:4860::8888
Default Server:  google-public-dns-a.google.com
Address:  2001:4860:4860::8888

> google.com
Server:  google-public-dns-a.google.com
Address:  2001:4860:4860::8888

Non-authoritative answer:
Name:    google.com
Addresses:  2a00:1450:400e:80b::200e
          172.217.20.110

C:\Users\Simon>ping -6 www.google.com

Pinging www.google.com [2a00:1450:400e:806::2004] with 32 bytes of data:
Reply from 2a00:1450:400e:806::2004: time=30ms
Reply from 2a00:1450:400e:806::2004: time=31ms
Reply from 2a00:1450:400e:806::2004: time=31ms
Reply from 2a00:1450:400e:806::2004: time=30ms

Ping statistics for 2a00:1450:400e:806::2004:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 30ms, Maximum = 31ms, Average = 30ms

Verification Commands

Checking which prefix was allocated to you by the ISP – don’t forget to specify the routing-instance if you are using them.

np@NP-FW01> show dhcpv6 client binding routing-instance VRF-Edge

IP/prefix                       Expires     State      ClientType    Interface       Client DUID
2a02:1234:420a::/48              2590728     BOUND      STATEFUL      pp0.0           LL0x29-a8:d0:e5:d3:2a:47

You can renew the prefix lease this way:

np@NP-FW01> request dhcpv6 client renew routing-instance VRF-Edge interface pp0.0

Displaying the IPv6 neighbours – this is similar to your ARP table in IPv4.

np@NP-FW01> show ipv6 neighbors
IPv6 Address                 Linklayer Address  State       Exp Rtr Secure Interface
2a02:1234:420a:1:488c:f549:1e67:254c
                             c8:5b:76:bc:04:6e  stale       1017 no no      ae0.311
fe80::103f:efd8:f2f7:d0cf    c8:5b:76:bc:04:6e  stale       363 no  no      ae0.311

Statistics for the DHCPv6 server:

np@NP-FW01> show dhcpv6 server statistics routing-instance VRF-Edge
Dhcpv6 Packets dropped:
    Total               0

Messages received:
    DHCPV6_DECLINE             1
    DHCPV6_SOLICIT             6
    DHCPV6_INFORMATION_REQUEST 60
    DHCPV6_RELEASE             0
    DHCPV6_REQUEST             3
    DHCPV6_CONFIRM             0
    DHCPV6_RENEW               0
    DHCPV6_REBIND              0
    DHCPV6_RELAY_FORW          0
    DHCPV6_RELAY_REPL          0

Messages sent:
    DHCPV6_ADVERTISE           4
    DHCPV6_REPLY               8
    DHCPV6_RECONFIGURE         0
    DHCPV6_RELAY_REPL          0

Now, if I was running a fairly simple network with just a few subnets, I would be using Prefix Delegation but I also have a couple of L3 devices behind this firewall. Extending PD all the way down would make it overly complex, so I’ll go for static configuration with OSPFv3 in the next article.

Please let me know in the comments if this configuration did not work for you!

Consistent software versions on dual-partition JunOS devices

Modern Junos devices have dual boot partitions, each with their own copy of the operating system. This ensures that the device will boot if storage or other boot-related issues are detected on the primary boot partition.

However, when you manually update the software, it is only updated on one of the partitions which is then set as the boot partition. The second, original, partition will keep running on the previous version until it is manually updated as well. This might be an advantage if you are testing a new software version and want to quickly roll back to the old one. It could also wreak havoc if you are inadvertently falling back to a pre-historic software version that is missing or not compatible with some of the features you’ve since enabled. If you have found a stable software version, it’s best to keep both partitions in sync. Here’s how to do it…

Identifying a software mismatch

The first hint is given at boot time. Right before the login prompt and banner, the console will alert you of the software version disparity.

kern.securelevel: -1 -> 1
Creating JAIL MFS partition...
JAIL MFS partition created
Boot media /dev/ad0 has dual root support


WARNING: JUNOS versions running on dual partitions are not same


** /dev/ad0s1a
FILE SYSTEM CLEAN; SKIPPING CHECKS
clean, 242154 free (34 frags, 30265 blocks, 0.0% fragmentation)

You can also find out from operational mode. The command below will show you the software versions on both of the partitions.

netprobe@netfilter> show system snapshot media internal
Information for snapshot on       internal (/dev/ad0s1a) (backup)
Creation date: Jul 31 11:13:12 2014
JUNOS version on snapshot:
  junos  : 12.1X44-D35.5-domestic
Information for snapshot on       internal (/dev/ad0s2a) (primary)
Creation date: Mar 4 19:53:11 2016
JUNOS version on snapshot:
  junos  : 12.1X46-D40.2-domestic

As you can see from the output, we are currently running on partition /dev/ad0s2a, which has a newer software version than the first partition /dev/ad0s1a.

Cloning the primary partition

To get the software version from our now-primary partition over to the backup, the system will first format the backup partition and then clone the contents. This process is initiated with the command below.

netprobe@netfilter> request system snapshot slice alternate
Formatting alternate root (/dev/ad0s1a)...
Copying '/dev/ad0s2a' to '/dev/ad0s1a' .. (this may take a few minutes)
The following filesystems were archived: /

After the cloning process you might need to reboot the device, depending on the model. If all went well, you will no longer see the warning prompt for version mismatch:

Creating JAIL MFS partition...
JAIL MFS partition created
Boot media /dev/ad0 has dual root support
** /dev/ad0s1a
FILE SYSTEM CLEAN; SKIPPING CHECKS

And voila, the snapshot command will now show the same SW version on both partitions.

netprobe@netfilter> show system snapshot media internal
Information for snapshot on       internal (/dev/ad0s1a) (backup)
Creation date: Mar 4 21:45:37 2016
JUNOS version on snapshot:
  junos  : 12.1X46-D40.2-domestic
Information for snapshot on       internal (/dev/ad0s2a) (primary)
Creation date: Mar 4 21:48:51 2016
JUNOS version on snapshot:
  junos  : 12.1X46-D40.2-domestic

On EX switches, you can alternate between boot partitions by entering this command.

request system reboot slice alternate media internal

Unfortunately this doesn’t seem to work on SRX devices, at least not on the branch devices I’ve worked with so far. If anyone knows how to make this work on these SRXs I would love to hear about it!

JNCIS-SEC – Exam JN0-332 passed!

Can’t believe it has already been two months since I last posted on my blog… The last two weeks of October I had to go into full-on study mode for the JN0-332 exam so I had to pause on the write-ups for a while. Fortunately the hard work paid off and after a good six months of study and labbing, with a hot summer in between, I sat the JN0-332 on the 26th of October and passed it with 83 percent.

The exam itself was very fair and covered all the topics on the blueprint and in the freely available Juniper Study Guides. I have sat a few “other vendor” exams where multiple questions caught me off-guard and were definitely not included in the related cert guides. Not really the case with this Juniper exam though, so hats off to Juniper for delivering a consistent test.

Another great thing is that Juniper lets you go back through the questions when you’re done. You can also mark questions, which you’re uncertain about, for review. When you’re done, and if you still have some time on the counter, you can run through it once more to fill in the gaps. Or perhaps one of the later questions made you rethink your answer on an earlier question. I’m certain this approach will earn test-takers some points.

The Score Report is pretty straight-forward. It breaks the exam down into the main topics, the total number of questions, number of correct answers and your total percentage. My worst scores were on Firewall User Authentication, which to be fair only had a couple of questions, and UTM, which I couldn’t fully simulate in my SRX100 and vSRX lab. Fair to say I wasn’t surprised with the outcome, but still very satisfied overall.

For anyone interested in the JNCIS-SEC certification, here is what I used to pass the exam…

JNCIS-SEC Study Guides

Probably the most important source of information for this exam, Juniper is offering free Study Guides for all Specialist tracks on their site. You will need an account but registration is free so there really is no excuse. Follow this link

The first PDF covers SRX basics, Security Zones, Policies, User Authentication, NAT, IPsec and Clustering. The second PDF is dedicated to the UTM features.

Juniper SRX Series – By Brad Woodberg, Rob Cameron

Juniper SRX Series - O'Reilly

I bought this book early on, when I first encountered the SRX at a new job. Weighing in at about 1000 pages, it’s the perfect reference for anyone dealing with the SRX on a daily basis. It’s not something you’ll read front to back though, and I’ve found myself reading through the chapters for whatever feature I need at a certain point. For example, the chapter on Screens gives you an in-depth review of each of the features, the attacks for which they were written, and so on. Highly recommended! I’ve also found that you can read the book online

Juniper vSRX – Firefly

The virtual edition of the SRX firewall. You can run a trial version on your Hypervisor and even try the Advanced Features for 60 days. More information here.

Three SRX100Bs

I bought three of these boxes for cheap on eBay. They don’t have the high memory so don’t support UTM or IPS, but they are great for configurations that are hard to do on the virtual appliance, like aggregated interfaces and clustering. Added bonus is that they also support routing and switching so they can be used for the ENT track also.

Junos Genius app

The official Juniper app for JNCIA and Specialist level exams. Whenever I had a few minutes off I would go through some practice questions. Very good to keep your mind on the content and memorise some of the technical details. I just wish they had a PC edition! 🙂

JUNOS GENIUS – Android version
JUNOS GENIUS – IOS version

Next challenge?

For a few weeks I was working on the Brocade Certified vRouter Engineer certification. I already touched some Vyatta/VyOS routers so figured I might as well try the free exam. Unfortunately, when I tried to book the exam, I found that the Brocade voucher had expired. I tried mailing Brocade but to no avail – they confirmed that the promotion was no longer running. That sort of halted the BCVRE endeavour…

For a good two weeks now I’ve been going through the JNCIS-ENT study guides. Bought myself a couple EX switches which will be complementing the SRX and vSRX I already have in the lab. It should also be a good refresh of the routing & switching topics as my CCNP is up for renewal in December 2016. As I did for JNCIS-SEC, I will be writing up my ENT labs on the blog.

If you are interested in the JNCIS-SEC certification, and have any other questions, feel free to post them in the comments section. Thank you for reading!

Allowing inbound DHCP requests on a Cisco ZBFW

I came across an interesting one today, where a Cisco Zone-Based Firewall needed to be reconfigured to serve DHCP for a segment connected to it in a zone called “Guest”. It already had a policy-map configured for traffic from Guest to Self, which had ACLs for SSH management.

First I tried adding these two lines to that ACL, in the existing class-map

 permit udp any any eq bootpc
 permit udp any any eq bootps

Although I did see the ACL match counters increment, DHCP was not handing out addresses yet. A quick search led me to this page on the Cisco site. In the last paragraph, they state the following:

If the routers’ inside interface is acting as a DHCP server and if the clients that connect to the inside interface are the DHCP clients, this DHCP traffic is allowed by default if there is no inside-to-self or self-to-inside zone policy.
However, if either of those policies does exist, you need to configure a pass action for the traffic of interest (UDP port 67 or UDP port 68) in the zone pair service policy.

In my case, there was a policy configured but with the action set to inspect. To fix it, I had to add a new ACL and class-map to the Guest-Self policy-map.

New ACL that matches the DHCP traffic. The source and destination is set to any because of the DHCP request format.

ip access-list extended Guest-Self-DHCP-ACL
 permit udp any any eq bootpc
 permit udp any any eq bootps

Tie the ACL to a new inspect class map:

class-map type inspect match-any Guest-Self-DHCP-CMap
 match access-group name Guest-Self-DHCP-ACL

And finally, add the class-map to the policy-map with the pass action

policy-map type inspect Guest-Self-PMap
 class type inspect Guest-Self-CMap
  inspect
 class type inspect Guest-Self-DHCP-CMap
  pass log
 class class-default
  drop

After that the clients started receciving IP addresses again.

ZBFW-ROUTER#show ip dhcp binding
Bindings from all pools not associated with VRF:
IP address          Client-ID/              Lease expiration        Type
                    Hardware address/
                    User name
192.168.200.201     014d.970e.4136.af       Oct 21 2015 10:43 AM    Automatic

Juniper SRX Clustering with LACP

Most deployment guides for SRX clusters out there focus on standard two-port deployments, where you have one port in, one port out and a couple of cluster links that interconnect and control the cluster. Unfortunately, in that design, one simple link failure will usually make the cluster fail over. Coming from the R&S realm, I am very careful when it comes to physical redundancy so I wanted to figure out a way to get this working with Etherchannels.

After a lot of reading and some trial and error, I ended up with a working solution. Probably not perfect, but definitely more redundant. So, in this post we will get the topology below configured and afterwards do some failover testing.

Physical connections

Basic Cluster Setup

The commands below will get the basic cluster up and running, assuming you have already configured the cluster and node ids.

set system root-authentication encrypted-password "$1$FQl4d.NC$l25c0bDGr5aPq9ZHx0R.S."
set groups node0 system host-name FW01A
set groups node1 system host-name FW01B
set apply-groups "${node}"
set chassis cluster reth-count 2
set chassis cluster redundancy-group 0 node 0 priority 120
set chassis cluster redundancy-group 0 node 1 priority 1
set chassis cluster redundancy-group 1 node 0 priority 120
set chassis cluster redundancy-group 1 node 1 priority 1
set chassis cluster redundancy-group 1 preempt
set interfaces fab0 fabric-options member-interfaces fe-0/0/5
set interfaces fab1 fabric-options member-interfaces fe-1/0/5

Layer3 Etherchannel configuration on the SRX

The physical ports will be bundled in two reth interface. Per reth interface, we will add two physical ports per cluster node, which yields a total of four ports. However only two links will ever be actively forwarding traffic – those on the active RG.

Physical connections

It is important that the links from every cluster nodes are terminated on separate etherchannels on the switches. Otherwise the switches would also load-balance the traffic over the two non-forwarding ports, as documented here…
In my topology, both switches have a Port-Channel 11 and 12, going to their own cluster node.

{primary:node0}
root@FW01A> show configuration interfaces
fe-0/0/0 {
    fastether-options {
        redundant-parent reth0;
    }
}
fe-0/0/1 {
    fastether-options {
        redundant-parent reth0;
    }
}
fe-0/0/2 {
    fastether-options {
        redundant-parent reth1;
    }
}
fe-0/0/3 {
    fastether-options {
        redundant-parent reth1;
    }
}
fe-1/0/0 {
    fastether-options {
        redundant-parent reth0;
    }
}
fe-1/0/1 {
    fastether-options {
        redundant-parent reth0;
    }
}
fe-1/0/2 {
    fastether-options {
        redundant-parent reth1;
    }
}
fe-1/0/3 {
    fastether-options {
        redundant-parent reth1;
    }
}

Pay special attention to the configuration of the reth interfaces. First, we tie the redundant interfaces to our redundancy-group 1, in which we will later control the failover conditions.

The minimum-links command determines how many interfaces must be up before the LACP bundle is considered up. Although we have two interfaces per cluster, we still want traffic forwarded in a worst-case scenario. Setting minimum-links 1 will keep the Etherchannel up even in the unlikely event of having three physical ports down.

reth0 {
    vlan-tagging;
    redundant-ether-options {
        redundancy-group 1;
        minimum-links 1;
        lacp {
            active;
            periodic fast;
        }
    }
    unit 111 {
        vlan-id 111;
        family inet {
            address 1.1.1.1/28;
        }
    }
}
reth1 {
    vlan-tagging;
    redundant-ether-options {
        redundancy-group 1;
        minimum-links 1;
        lacp {
            active;
            periodic fast;
        }
    }
    unit 255 {
        vlan-id 255;
        family inet {
            address 10.255.255.254/28;
        }
    }
}

By adding the vlan-tagging command, and adding a logical subinterface (unit 255) with vlan-id 255 specified, we are creating a tagged L3 Etherchannel. In other words, the SRX is expecting packets for sub-interface reth1.255 to be tagged with a dot1q value of 255. The unit number can be any number you like, but it’s best to keep the unit number and the dot1q value aligned – for your own sanity! 🙂

Security Zones and Policies

To check basic connectivity and later run some failover tests, I have added the following configuration for the Security Zones, Policies and Source NAT. In a production environment, you probably won’t be allowing ping.

root@FW01A# show zones
security-zone trust {
    address-book {
        address Net-10.255.255.240-28 10.255.255.240/28;
    }
    host-inbound-traffic {
        system-services {
            ping;
        }
    }
    interfaces {
        reth1.255;
    }
}
security-zone untrust {
    address-book {
        address Net-1.1.1.0-28 1.1.1.0/28;
    }
    interfaces {
        reth0.111 {
            host-inbound-traffic {
                system-services {
                    ping;
                }
            }
        }
    }
}
{primary:node0}[edit security]
root@FW01A# show policies
from-zone trust to-zone untrust {
    policy Test-Policy {
        match {
            source-address Net-10.255.255.240-28;
            destination-address Net-1.1.1.0-28;
            application any;
        }
        then {
            permit;
        }
    }
}
{primary:node0}[edit security]
root@FW01A# show nat
source {
    rule-set SNAT-Trust-to-Untrust {
        from zone trust;
        to zone untrust;
        rule HideNAT-1 {
            match {
                source-address 10.255.255.240/28;
            }
            then {
                source-nat {
                    interface;
                }
            }
        }
    }
}

Preparing the switches for L2

I’m running two Cisco 3550 switches as my layer3 core switches. For now, I will just add the Layer 1 and 2 stuff.

vlan 111
 name untrust
!
vlan 255
 name transit


interface range fastEthernet 0/1 - 2
 channel-group 11 mode active

interface range fastEthernet 0/3 - 4
 channel-group 12 mode active

int po 11
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport nonegotiate
 switchport trunk allowed vlan 111

int po 12
 switchport mode trunk
 switchport nonegotiate
 switchport trunk allowed vlan 255

For CS2, we can copy paste the exact same config.

A second trunk link is added between the core switches, which will carry inter-switch traffic for VLAN 111 and 255.

interface FastEthernet0/23
 channel-group 1 mode active
!
interface FastEthernet0/24
 channel-group 1 mode active

...

interface Port-channel1
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 111,255
 switchport mode trunk
 switchport nonegotiate

The LACP configuration is now complete, and after cabling everything up we see the Port-channels are in the bundled state:

NP-CS1>show etherchannel summary
Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator

        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port


Number of channel-groups in use: 3
Number of aggregators:           3

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)         LACP      Fa0/23(P)   Fa0/24(P)
11     Po11(SU)        LACP      Fa0/1(P)    Fa0/2(P)
12     Po12(SU)        LACP      Fa0/3(P)    Fa0/4(P)

The SRX is bit more detailed with the information in the lacp command

root@FW01A> show lacp interfaces
Aggregated interface: reth0
    LACP state:       Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  Activity
      fe-0/0/0       Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      fe-0/0/0     Partner    No    No   Yes  Yes  Yes   Yes     Slow    Active
      fe-0/0/1       Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      fe-0/0/1     Partner    No    No   Yes  Yes  Yes   Yes     Slow    Active
      fe-1/0/0       Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      fe-1/0/0     Partner    No    No   Yes  Yes  Yes   Yes     Slow    Active
      fe-1/0/1       Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      fe-1/0/1     Partner    No    No   Yes  Yes  Yes   Yes     Slow    Active
    LACP protocol:        Receive State  Transmit State          Mux State
      fe-0/0/0                  Current   Slow periodic Collecting distributing
      fe-0/0/1                  Current   Slow periodic Collecting distributing
      fe-1/0/0                  Current   Slow periodic Collecting distributing
      fe-1/0/1                  Current   Slow periodic Collecting distributing

Layer3 Switch Configuration

Now that we have our layer 2 connectivity, we can move on to the IP addressing. On the L3 switches, I will configure the SVIs in the transit network, and run HSRP between them with hello/hold timers of 1 and 3 seconds. This would allow for a reasonable failover time in case of an outage.

NP-CS1:

interface Vlan255
 ip address 10.255.255.242 255.255.255.240
 standby 255 ip 10.255.255.241
 standby 255 timers 1 3
 standby 255 priority 110
 standby 255 preempt

NP-CS2:

NP-CS2#sh run int vlan 255
!
interface Vlan255
 ip address 10.255.255.243 255.255.255.240
 standby 255 ip 10.255.255.241
 standby 255 timers 1 3
 standby 255 preempt

The VLAN interfaces come up immediately and as a final test for basic L3 connectivity, we try pinging the SRX from the core switch.

NP-CS1#ping 10.255.255.254 repeat 10

Type escape sequence to abort.
Sending 10, 100-byte ICMP Echos to 10.255.255.254, timeout is 2 seconds:
!!!!!!!!!!
Success rate is 100 percent (10/10), round-trip min/avg/max = 1/2/4 ms

Checking the ARP table, we can see the Cluster MAC address, with the 4th last digit (1) being the cluster ID and the last two (01) the reth number.

NP-CS1#show ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  10.255.255.254          1   0010.dbff.1001  ARPA   Vlan255
Internet  10.255.255.242          -   0012.7f81.8f00  ARPA   Vlan255
Internet  10.255.255.243          8   000d.ed6f.9680  ARPA   Vlan255
Internet  10.255.255.241          -   0000.0c07.acff  ARPA   Vlan255

The default route on both core switches is pointed to 10.255.255.254

ip route 0.0.0.0 0.0.0.0 10.255.255.254

Internet segment configuration

I have connected two routers to the fa0/5 interfaces of the switches and added them to vlan 111. They are running HSRP with a VIP of 1.1.1.14 and hello/hold timers of 1 and 3 seconds.

ISP1#show standby brief
                     P indicates configured to preempt.
                     |
Interface   Grp  Pri P State   Active          Standby         Virtual IP
Fa0/1       111  110   Active  local           1.1.1.13        1.1.1.14

Running a simple ping to test connectivity between the firewalls and the ISP router.

root@FW01A> ping 1.1.1.14 count 1
PING 1.1.1.14 (1.1.1.14): 56 data bytes
64 bytes from 1.1.1.14: icmp_seq=0 ttl=255 time=2.981 ms

*snip*

To test if traffic is being forwarded across the firewall, we finally try reaching the ISP routers from the core switches.

NP-CS1#ping 1.1.1.14 source vlan 255

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 1.1.1.14, timeout is 2 seconds:
Packet sent with a source address of 10.255.255.242
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms

After many bits of configuration, our cluster is live so we can start implementing some failover mechanisms.

Configuring the interface-monitor

When the interface-monitor is configured, it sets a threshold value of 255 to each redundancy group. Each interface is assigned a custom weight for the redundancy group, which is detracted from the threshold if the interface goes physically down. When the threshold reaches zero, the redundancy group and all its objects fail over.

Note – a commmon mistake (trust me) is to mix up the interface monitor weights with the RG priorities. These are two separate values, the interface monitor threshold is always 255 by default.

In my lab scenario, I will give each port a value of 128. Two physical links down will trigger a failover to node 1. This is the final configuration for redundancy group 1:

{primary:node0}[edit chassis cluster]
root@FW01A# show
reth-count 2;
redundancy-group 0 {
    node 0 priority 120;
    node 1 priority 1;
}
redundancy-group 1 {
    node 0 priority 120;
    node 1 priority 1;
    preempt;
    interface-monitor {
        fe-0/0/0 weight 128;
        fe-0/0/1 weight 128;
        fe-1/0/0 weight 128;
        fe-1/0/1 weight 128;
        fe-0/0/2 weight 128;
        fe-0/0/3 weight 128;
        fe-1/0/3 weight 128;
        fe-1/0/2 weight 128;
    }

Failover scenarios

  • First, we will physically disconnect fe-0/0/0 which should not impact our regular traffic flow.

Disconnecting fe-0/0/0/

{primary:node0}
root@FW01A> show interfaces terse | match fe-0/0/0
fe-0/0/0                up    down
fe-0/0/0.111            up    down aenet    --> reth0.111
fe-0/0/0.32767          up    down aenet    --> reth0.32767

The JSRPD daemon logged the following entries in which we can see it decrement the value of 128 of the global threshold of 255.

root@FW01A> show log jsrpd | last | match 19:30
Oct  6 19:30:31 Interface fe-0/0/0 is going down
Oct  6 19:30:31 fe-0/0/0 interface monitored by RG-1 changed state from Up to Down
Oct  6 19:30:31 intf failed, computed-weight -128
Oct  6 19:30:31 LED changed from Green to Amber, reason is Monitored objects are down
Oct  6 19:30:31 Current threshold for rg-1 is 127. Failures: interface-monitoring
Oct  6 19:30:42 printing fpc_num h0
Oct  6 19:30:42 jsrpd_ifd_msg_handler: Interface reth0 is up
Oct  6 19:30:42 reth0 from  jsrpd_ssam_reth_read reth_rg_id=1

The chassis cluster interfaces also shows us the interface as down and how much weight it had:

root@FW01A> show chassis cluster interfaces | find Monitoring
Interface Monitoring:
    Interface         Weight    Status    Redundancy-group
    fe-1/0/2          128       Up        1
    fe-1/0/3          128       Up        1
    fe-0/0/3          128       Up        1
    fe-0/0/2          128       Up        1
    fe-1/0/1          128       Up        1
    fe-1/0/0          128       Up        1
    fe-0/0/1          128       Up        1
    fe-0/0/0          128       Down      1

Because our threshold value is still at 127 (255 – 128), RG1 is still active on node0 and no failover event was triggered.

Redundancy group: 1 , Failover count: 0
    node0                   120         primary        yes      no
    node1                   1           secondary      yes      no
  • As a second test, we will also unplug fe-0/0/1. Without the interface monitoring, this would halt the forwarding of traffic to the untrust zone as reth0 would have no more interfaces on node0.

Disconnecting fe-0/0/1 as well

To get a basic idea of the failover time, I’m running a ping test from the core switch.

Sending 1000, 100-byte ICMP Echos to 1.1.1.14, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
*snip*

We lost one packet during the failover with a timeout of two seconds, which is not bad at all. The switchover is stateful so TCP connections should be able to handle this just fine. Let’s see what happened on the SRX…

Our interface monitor shows two interfaces down that had a weight of 128. The threshold should now have reached zero which retired node0 from the cluster.

root@FW01A> show chassis cluster interfaces | find Monitoring
Interface Monitoring:
    Interface         Weight    Status    Redundancy-group
    fe-1/0/2          128       Up        1
    fe-1/0/3          128       Up        1
    fe-0/0/3          128       Up        1
    fe-0/0/2          128       Up        1
    fe-1/0/1          128       Up        1
    fe-1/0/0          128       Up        1
    fe-0/0/1          128       Down      1
    fe-0/0/0          128       Down      1

Redundancy group 1 is now primary on node1. The routing engine is still active on node0.

root@FW01A> show chassis cluster status
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 0 , Failover count: 0
    node0                   120         primary        no       no
    node1                   1           secondary      no       no

Redundancy group: 1 , Failover count: 1
    node0                   0           secondary      yes      no
    node1                   1           primary        yes      no

And the JSRPD logged documented the entire event:

Oct  6 19:43:01 Interface fe-0/0/1 is going down
Oct  6 19:43:01 fe-0/0/1 interface monitored by RG-1 changed state from Up to Down
Oct  6 19:43:01 intf failed, computed-weight -256
Oct  6 19:43:01 RG(1) priority changed on node0 120->0 Priority is set to 0, Monitoring objects are down
Oct  6 19:43:01 Successfully sent an snmp-trap due to priority change from 120 to 0 on RG-1 on cluster 1 node 0. Reason: Priority is set to 0, Monitoring objects are down
Oct  6 19:43:01 Current threshold for rg-1 is -1. Setting priority to 0. Failures: interface-monitoring
Oct  6 19:43:01 Both the nodes are primary. RG-1 PRIMARY->SECONDARY_HOLD due to preempt/yield, my priority 0 is worse than other node's priority 1
Oct  6 19:43:01 Successfully sent an snmp-trap due to a failover from primary to secondary-hold on RG-1 on cluster 1 node 0. Reason: Monitor failed: IF
Oct  6 19:43:01 updated rg_info for RG-1 with failover-cnt 1 state: secondary-hold into ssam. Result = success, error: 0
Oct  6 19:43:01 reth0 ifd state changed from node0-primary -> node1-primary for RG-1
Oct  6 19:43:01 reth1 ifd state changed from node0-primary -> node1-primary for RG-1
Oct  6 19:43:01 updating primary-node as node1 for RG-1 into ssam. Previous primary was node0. Result = success, 0
Oct  6 19:43:01 Successfully sent an snmp-trap due to a failover from primary to secondary-hold on RG-1 on cluster 1 node 0. Reason: Monitor failed: IF
Oct  6 19:43:01 printing fpc_num h0
Oct  6 19:43:01 jsrpd_ifd_msg_handler: Interface reth0 is up
Oct  6 19:43:01 reth0 from  jsrpd_ssam_reth_read reth_rg_id=1
Oct  6 19:43:01 printing fpc_num h1
Oct  6 19:43:01 jsrpd_ifd_msg_handler: Interface reth1 is up
Oct  6 19:43:01 reth1 from  jsrpd_ssam_reth_read reth_rg_id=1
Oct  6 19:43:02 SECONDARY_HOLD->SECONDARY due to back to back failover timer expiry for RG-1
Oct  6 19:43:02 Successfully sent an snmp-trap due to a failover from secondary-hold to secondary on RG-1 on cluster 1 node 0. Reason: Back to back failover interval expired
Oct  6 19:43:02 updated rg_info for RG-1 with failover-cnt 1 state: secondary into ssam. Result = success, error: 0
Oct  6 19:43:02 Successfully sent an snmp-trap due to a failover from secondary-hold to secondary on RG-1 on cluster 1 node 0. Reason: Back to back failover interval expired

What is interesting is that the SRX is explicitly sending an SNMP trap for these events, so make sure you have a good monitoring tool and trap receiver in place, preferably with alerting.

  • As the ultimate test, I will unplug interface fe-1/0/0, the first port of node1, which will make the whole LACP bundle reth0 run on just one link. If we had configured minimum-links 2 this action would bring down the whole reth bundle

We are riding on one wheel now!

After pulling the cable, reth0 is strolling along on just one link:

root@FW01A> show lacp interfaces reth0 | find protocol
    LACP protocol:        Receive State  Transmit State          Mux State
      fe-0/0/0            Port disabled     No periodic           Detached
      fe-0/0/1            Port disabled     No periodic           Detached
      fe-1/0/0            Port disabled     No periodic           Detached
      fe-1/0/1                  Current   Slow periodic Collecting distributing

And the Etherchannel is still up:

root@FW01A> show interfaces terse | match reth0
reth0                   up    up
reth0.111               up    up   inet     1.1.1.1/28
reth0.32767             up    up
root@FW01A> show chassis cluster interfaces | find Monitoring
Interface Monitoring:
    Interface         Weight    Status    Redundancy-group
    fe-1/0/2          128       Up        1
    fe-1/0/3          128       Up        1
    fe-0/0/3          128       Up        1
    fe-0/0/2          128       Up        1
    fe-1/0/1          128       Up        1
    fe-1/0/0          128       Down      1
    fe-0/0/1          128       Down      1
    fe-0/0/0          128       Down      1

For anyone designing a network solution with high-availability in mind, this is all very promising. Even with 75% of our physical links down, the reth will stay functional and forward traffic.

I have now reconnected all cabling for a last question I’ve been pondering about – how does the interface monitor react when physical ports go down on both nodes? Suppose we are running our trunks to a chassis or VSS and lose a linecard or stack-member? Does the interface weight count globally and trigger a failover? Let’s find out…

  • Disconnecting fe-0/0/0 and fe-1/0/0, the first port on each node which is in reth0
root@FW01A> show interfaces terse | match "fe-0/0/0|fe-1/0/0"
fe-0/0/0                up    down
fe-0/0/0.111            up    down aenet    --> reth0.111
fe-0/0/0.32767          up    down aenet    --> reth0.32767
fe-1/0/0                up    down
fe-1/0/0.111            up    down aenet    --> reth0.111
fe-1/0/0.32767          up    down aenet    --> reth0.32767

Both links, which both had a weight of 128, are now down but RG1 is still rolling along fine on node0 and the priority is not set to 0. This does prove that the interface-monitor value is tied to the node on which the interface resides.

Redundancy group: 1 , Failover count: 2
    node0                   120         primary        yes      no
    node1                   1           secondary      yes      no

Let’s disconnect fe-0/0/2 and fe-1/0/2 to simulate a really bad day. Our interface monitor now shows half our revenue ports down:

Interface Monitoring:
    Interface         Weight    Status    Redundancy-group
    fe-1/0/2          128       Down      1
    fe-1/0/3          128       Up        1
    fe-0/0/3          128       Up        1
    fe-0/0/2          128       Down      1
    fe-1/0/1          128       Up        1
    fe-1/0/0          128       Down      1
    fe-0/0/1          128       Up        1
    fe-0/0/0          128       Down      1

Interestingly though, both nodes have their priorities now set to zero, but node0 is still primary for RG1.

Redundancy group: 0 , Failover count: 0
    node0                   120         primary        no       no
    node1                   1           secondary      no       no

Redundancy group: 1 , Failover count: 2
    node0                   0           primary        yes      no
    node1                   0           secondary      yes      no

Is there still traffic being sent through though? Running a quick ping from CS2:

NP-CS1#ping 1.1.1.14

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 1.1.1.14, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/3/4 ms

Conclusion

This turned into quite a lengthy post but hopefully demonstrated the benefits of having an Etherchannel over the standard single-port configurations you’ll find in most documentation. Bundling your physical interfaces in a LAG gives you that extra layer of physical redundancy with the added benefit of load sharing on the links. We were able to “lose” 75 percent of our revenue ports without any significant impact. Adding the tagged Layer3 interfaces also gives us the option to add more logical units in the future, which can in turn be assigned to their own routing-instances and zones if you’re dealing with a large-scale or multi-tenant environment.

If this was helpful for you or if you have a remark, please let me know below in the comments! Thanks for reading 🙂

External links that added to my understanding of this topic

Failure is optional – Reth Interfaces and LACP
Juniper KB – Link aggregation (LACP) supported/non-supported configurations on SRX and EX
The free JNCIS-SEC Study Guides at Juniper.net

Outage Report – SRX ALG failure – ‘application failure or action’ logs

The initial complaint came from one of our branch offices, which was having issues reaching internal applications and internet sites. After some troubleshooting, we found that we could no longer query against the remote DNS server, and they could not reach their internal forwarders. First, I wrote it off as a DNS server issue but after more querying to different VPN networks it was clear that all UDP DNS traffic was being lost in transit.

To verify if the packets were going through, I started a packet capture on the LAN interface of the remote firewall and did some DNS queries. Fact, there were no UDP/53 packets arriving over the tunnel. No issues with TCP/53 though. So I enabled logging on this specific tunnel policy and while checking the logs for the DNS, I found these lines instead of the usual RT_FLOW_SESSION_CREATE entries.

2015-09-25 14:19:53	User.Info	10.164.3.70	1 2015-09-25T14:19:53.047 VPN_Box_A RT_FLOW - RT_FLOW_SESSION_CLOSE [junos@2636.1.1.1.2.36 reason="application failure or action" source-address="10.164.242.1" source-port="58358" destination-address="10.10.132.40" destination-port="53" service-name="junos-dns-udp" nat-source-address="10.164.242.1" nat-source-port="58358" nat-destination-address="10.10.132.40" nat-destination-port="53" src-nat-rule-name="None" dst-nat-rule-name="None" protocol-id="17" policy-name="FW-Policy100" source-zone-name="trust" destination-zone-name="untrust" session-id-32="51106" packets-from-client="0" bytes-from-client="0" packets-from-server="0" bytes-from-server="0" elapsed-time="1" application="UNKNOWN" nested-application="UNKNOWN" username="N/A" roles="N/A" packet-incoming-interface="reth0.0" encrypted="No "]

This is not a standard event: RT_FLOW_SESSION_CLOSE [junos@2636.1.1.1.2.36 reason=”application failure or action”

When going through the entire log file for other entries with the “application failure or action” message, I found many more related to RPC and FTP. This immediately pointed to the ALG engine.

2015-09-25 14:19:54	User.Info	10.164.3.70	1 2015-09-25T14:19:53.846 VPN_Box_A RT_FLOW - RT_FLOW_SESSION_CLOSE [junos@2636.1.1.1.2.36 reason="application failure or action" source-address="10.164.110.223" source-port="9057" destination-address="10.104.12.161" destination-port="21" service-name="junos-ftp" nat-source-address="10.9.1.150" nat-source-port="58020" nat-destination-address="10.xx.70.1" nat-destination-port="21" src-nat-rule-name="SNAT-Policy5" dst-nat-rule-name="NAT-Policy10" protocol-id="6" policy-name="FW-FTP" source-zone-name="trust" destination-zone-name="untrust" session-id-32="24311" packets-from-client="0" bytes-from-client="0" packets-from-server="0" bytes-from-server="0" elapsed-time="1" application="UNKNOWN" nested-application="UNKNOWN" username="N/A" roles="N/A" packet-incoming-interface="reth0.0" encrypted="No "]

2015-09-25 14:19:51	User.Info	10.164.3.70	1 2015-09-25T14:19:51.444 VPN_Box_A RT_FLOW - RT_FLOW_SESSION_CLOSE [junos@2636.1.1.1.2.36 reason="application failure or action" source-address="10.164.243.110" source-port="50801" destination-address="10.10.132.50" destination-port="135" service-name="junos-ms-rpc-tcp" nat-source-address="10.164.243.110" nat-source-port="50801" nat-destination-address="10.10.132.50" nat-destination-port="135" src-nat-rule-name="None" dst-nat-rule-name="None" protocol-id="6" policy-name="FW-Policy100" source-zone-name="trust" destination-zone-name="untrust" session-id-32="2317" packets-from-client="0" bytes-from-client="0" packets-from-server="0" bytes-from-server="0" elapsed-time="1" application="UNKNOWN" nested-application="UNKNOWN" username="N/A" roles="N/A" packet-incoming-interface="reth0.0" encrypted="No "]

Coincidentally, we had already received a ticket related to FTP traffic over a different VPN, but the tunnel was up and all other services were open, so we didn’t immediately relate the two cases.

As a quick workaround for DNS, I disabled the DNS ALG. If you want to know what the ALG exactly does, Bart Jansens has a good write-up here. We could live without those features for a while.

{primary:node0}
netprobe@VPN_Box_A> show configuration security alg dns 
disable;

After disabling the ALG, all our DNS queries were going through again. Instead of immediately closing the session when a response is received, the DNS sessions get a 60 second timeout. This gave a few more sessions in the flow table but nothing the SRX couldn’t manage.

The same workaround worked for FTP. Unfortunately, I couldn’t find a quick way to restart the ALG process so I have rebooted the cluster nodes over the weekend. After rebooting, all the services inspected by the ALGs worked as designed again.

*Hostnames, IP addresses and Policy names were changed

JNCIS-SEC – Juniper SRX100 Cluster Configuration

In this post I will go through the basics of cluster configuration on the SRX. I still have a couple of SRX100s laying around, which is perfect to cover the clustering topics of the JNCIS-SEC blueprint!

Before you start configuring the cluster, always verify that both your boxes are on the same software version.

root@FW01A> show system software
Information for junos:

Comment:
JUNOS Software Release [12.1X44-D40.2]

Physical Wiring

Here is how the cluster will be cabled up. Because there’s a lot to remember during the configuration, it’s best to make this sort of diagram before you begin.

SRX100B Cluster Wiring

Connect the fxp1 and Fab ports

The Control link (fxp1) is used to synchronize configuration and performs cluster health checks by sending heartbeat messages. The physical port location depends on the SRX model, and can also be configurable on the high-end models. In my case, on the branch SRX100B, the fe0/0/07 interfaces are predetermined as fxp1.

The fab interface is used to exchange all the session state information between both devices. This provides a stateful failover if anything happens to the primary cluster node. You can choose which interface to assign. I will use fe/0/0/5 so all the first ports stay available.

Setting the Cluster-ID and Node ID

First, wipe all the old configuration and put both devices in cluster mode. Some terminology:

  • The cluster ID ranges from 1 to 15 and uniquely identifies the cluster if you have multiple clusters across the network. I will use Cluster ID 1
  • The node ID identifies both members in the cluster. A cluster will only have two members ever, so the options are 0 and 1

The commands below are entered in operational mode:

root@FW01A> set chassis cluster cluster-id 1 node 0 reboot
Successfully enabled chassis cluster. Going to reboot now
root@FW01B> set chassis cluster cluster-id 1 node 1 reboot
Successfully enabled chassis cluster. Going to reboot now

Keep attention when you enter the commands above. Make sure you are actually enabling the cluster, not disabling it. That would return the following message:

 Successfully disabled chassis cluster. Going to reboot now

Configuring the management interfaces

Once the devices have restarted we can move on to the configuration part.

To get out-of-band access to your firewalls, you really should configure both the members with a managment IP on the fxp0 interface.
All member-specific configuration is applied under the groups node-memmbers stanza. This is also where the hostnames are configured.

{primary:node0}[edit groups]
root@FW01A# show
node0 {
    system {
        host-name FW01A;
    }
    interfaces {
        fxp0 {
            unit 0 {
                family inet {
                    address 192.168.1.1/24;
                }
            }
        }
    }
}
node1 {
    system {
        host-name FW01B;
    }
    interfaces {
        fxp0 {
            unit 0 {
                family inet {
                    address 192.168.1.2/24;
                }
            }
        }
    }
}

On the SRX100B, the fxp0 interface is automatically mapped to the fe0/0/6 interface. Be sure to check the documentation for your specific model.

Apply Group

Before committing, don’t forget to include the command below . This ensures that node-specific config is only applied to that particular node.

{primary:node0}[edit]
root@FW01A# set apply-groups "${node}"

Configuring the fabric interface

The next step is to configure your fabric links, which are used to exchange the session state. Node0 has the fab0 interface and Node1 has the fab1 interface.

{primary:node0}[edit interfaces]
root@FW01A# show
fab0 {
    fabric-options {
        member-interfaces {
            fe-0/0/5;
        }
    }
}
fab1 {
    fabric-options {
        member-interfaces {
            fe-1/0/5;
        }
    }
}

After a commit, we can see both the control and fabric links are up.

root@FW01A# run show chassis cluster interfaces
Control link status: Up

Control interfaces:
    Index   Interface        Status
    0       fxp1             Up

Fabric link status: Up

Fabric interfaces:
    Name    Child-interface    Status
                               (Physical/Monitored)
    fab0    fe-0/0/5           Up   / Up
    fab0
    fab1    fe-1/0/5           Up   / Up
    fab1

Redundant-pseudo-interface Information:
    Name         Status      Redundancy-group
    lo0          Up          0

Configuring the Redundancy Groups

The redundancy group is where you configure the cluster’s failover properties relating to a collection of interfaces or other objects. RG0 is configured by default when you activate the cluster, and manages the redundancy for the routing engines. Let’s create a new RG 1 for our interfaces.

{secondary:node0}[edit chassis]
root@FW01A# show
cluster {
    redundancy-group 0 {
        node 0 priority 100;
        node 1 priority 1;
    }
    redundancy-group 1 {
        node 0 priority 100;
        node 1 priority 1;
    }
}

Configuring Redundant Ethernet interfaces

The reth interfaces are bundles of physical ports across both cluster members. The child interfaces inherit the configuration from the overlying reth interface – think of it as being similar to an 802.3ad Etherchannel. In fact, you can use an Etherchannel to use more than one physical port on each node.

{secondary:node0}[edit chassis cluster]
root@FW01A# set reth-count 2

After entering this command, you can do a quick commit, which will make the reth interfaces visible in the terse command.

root@FW01A# run show interfaces terse | match reth
reth0                   up    down
reth1                   up    down

Now you can configure the reth interfaces as you would with any other interface, give them an IP and assign them to the Redundancy Group.

reth0 is our outside interface, and reth1 is the inside.

{secondary:node0}[edit interfaces]
root@FW01A# show reth0
redundant-ether-options {
    redundancy-group 1;
}
unit 0 {
    family inet {
        address 1.1.1.1/24;
    }
}
{secondary:node0}[edit interfaces]
root@FW01A# show reth1
redundant-ether-options {
    redundancy-group 1;
}
unit 0 {
    family inet {
        address 10.0.0.1/24;
    }
}

When our reths are configured, we can add our physical ports. The fe-0/0/0 (node0) and fe-1/0/0 (node1) will join reth0, fe-0/0/1 and fe-1/0/1 will join reth1.

{secondary:node0}[edit interfaces]
root@FW01A# show
fe-0/0/0 {
    fastether-options {
        redundant-parent reth0;
    }
}
fe-0/0/1 {
    fastether-options {
        redundant-parent reth1;
    }
}
fe-1/0/0 {
    fastether-options {
        redundant-parent reth0;
    }
}
fe-1/0/1 {
    fastether-options {
        redundant-parent reth1;
    }
}

Interface monitoring

We can use interface monitoring to subtract a predetermined priority value off our redundancy group priority, when a link goes physically down.

For example, node0 is primary for RG1 with a priority of 100. If we add an interface-monitor value of anything higher than 100 to the physical interface, the link-down event will cause to priority to drop to zero and trigger the failover. Configuration is applied at the redundancy-groups:

{primary:node0}[edit chassis cluster]
root@FW01A# show
reth-count 2;
redundancy-group 0 {
    node 0 priority 100;
    node 1 priority 1;
}
redundancy-group 1 {
    node 0 priority 100;
    node 1 priority 1;
    preempt;
    gratuitous-arp-count 5;
    interface-monitor {
        fe-0/0/0 weight 255;
        fe-0/0/1 weight 255;
        fe-1/0/0 weight 255;
        fe-1/0/1 weight 255;
    }
}

Finally, we add the interfaces to security zones.

root@FW01A# show security zones
security-zone untrust {
    interfaces {
        reth0.0;
    }
}
security-zone trust {
    host-inbound-traffic {
        system-services {
            ping;
        }
    }
    interfaces {
        reth1.0;
    }
}

Verification

After cabling it up, we can verify that the cluster is fully operational.

root@FW01A> show chassis cluster status
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 0 , Failover count: 0
    node0                   100         primary        no       no
    node1                   1           secondary      no       no

Redundancy group: 1 , Failover count: 0
    node0                   100         primary        yes      no
    node1                   1           secondary      yes      no
root@FW01A> show chassis cluster interfaces
Control link status: Up

Control interfaces:
    Index   Interface        Status
    0       fxp1             Up

Fabric link status: Up

Fabric interfaces:
    Name    Child-interface    Status
                               (Physical/Monitored)
    fab0    fe-0/0/5           Up   / Up
    fab0
    fab1    fe-1/0/5           Up   / Up
    fab1

Redundant-ethernet Information:
    Name         Status      Redundancy-group
    reth0        Up          1
    reth1        Up          1

Redundant-pseudo-interface Information:
    Name         Status      Redundancy-group
    lo0          Up          0

Interface Monitoring:
    Interface         Weight    Status    Redundancy-group
    fe-1/0/1          255       Up        1
    fe-1/0/0          255       Up        1
    fe-0/0/1          255       Up        1
    fe-0/0/0          255       Up        1

Let’s see how long the failover takes, by unplugging one of the links in the trust zone.

Pinging from the inside switch to the reth1 address 10.0.0.1

theswitch#ping 10.0.0.1 repeat 10000

Type escape sequence to abort.
Sending 10000, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Excellent, only lost one ping while failing over to node1 which would be 2 seconds.

We can see that node1 is now the primary for redundancy group 1, which holds our interfaces:

root@FW01A> show chassis cluster status redundancy-group 1
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 1 , Failover count: 3
    node0                   0           secondary      yes      no
    node1                   1           primary        yes      no

{primary:node0}

In the next article, I will dive a bit deeper and integrate the SRX cluster in a real-world topology.

JNCIS-SEC – Custom IDP policies on the Juniper SRX

Activating Intrusion Detection and Prevention (or IPS, as all the cool kids call it) on the SRX is quite straightforward in itself, but turning it on and relying solely on the default IDP policies is not going to cut it. While doing my own tests, running some scriptkiddie attacks against a virtual machine, I was expecting lots of sirens but unfortunately, the SRX stayed silent. No detection and definitely no prevention. As I quickly gathered, you actually need to put some thought in the services you are exposing and the kind of attacks you can expect, and then craft your own policies with the specific signatures.

For this one, I’ll be creating a custom IDP rule on the vSRX. I’m assuming you have already set up the vSRX, activated the advanced features license and set up IDP. If not, here are some links that will put you on the right path.

Scenario

  • We are hosting an SSH/SFTP server on my public IP address of 1.1.1.10/32. Every day, there are hundreds of brute-force login attempts. Obviously, the server admin has not implemented protective measures so we will try to reduce the impact by enabling IDP.
  • Destination NAT has been configured and is translating 1.1.1.10 to internal DMZ server 10.0.4.10.

Here is the firewall configuration. IDP has already been enabled on this policy.

[edit security policies from-zone untrust to-zone dmz policy FW-SFTP-Servers]
root@NP-vSRX-01# show
match {
    source-address any;
    destination-address Grp-SFTP-Servers;
    application junos-ssh;
}
then {
    permit {
        application-services {
            idp;
        }
    }
    log {
        session-init;
    }
}

Server setup and brute-force script

I’m using this script which runs a txt file of common passwords against the username specified.

On the linux server, let’s create a user, aptly named “admin”, with one of the passwords from the text files

lab@V104-10:~$ sudo useradd admin
lab@V104-10:~$ sudo passwd admin
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully

First, running the script with the standard IDP policy of Server-Protection-1G

lab@V120-10:/tmp/brutessh$ python brutessh.py -h 1.1.1.10 -u admin -d passlist.txt

*************************************
*SSH Bruteforcer Ver. 0.2           *
*Coded by Christian Martorella      *
*Edge-Security Research             *
*laramies@gmail.com                 *
*************************************

HOST: 1.1.1.10 Username: admin Password file: passlist.txt
===========================================================================
Trying password...
miller


Auth OK ---> Password Found: changeme
tiger

Times -- > Init: 0.032772 End: 6.861204

The SRX was not paying attention during the attack -nothing in the attack table- and the server got owned.

root@NP-vSRX-01> show security idp attack table
root@NP-vSRX-01>

Rolling your own IDP rules

Unfortunately, the server guy had to be let go and we are now looking for a way to be prevent these kinds of attacks. Of course, the first thing to do would be to harden the server and enforce strong credentials, but it does help if you are notified of an ongoing attack. Even better if the SRX could also prevent or slow down the attack.

For our scenario, we are specifically looking for SSH Brute-force attacks. A good way to find out if the SRX has an on-board signature for this is to browse the attack database. You can find all these signatures in operational mode.

root@NP-vSRX-01# run show security idp attack description SSH?
Possible completions:
          Attack name
  SSH:AUDIT:SSH-V1
  SSH:AUDIT:UNEXPECTED-HEADER
  SSH:BRUTE-LOGIN
  SSH:ERROR:COOKIE-MISMATCH
  SSH:ERROR:INVALID-HEADER
  SSH:ERROR:INVALID-PKT-TYPE
  SSH:ERROR:MSG-TOO-LONG
  SSH:ERROR:MSG-TOO-SHORT
  SSH:MISC:EXPLOIT-CMDS-UNIX
  SSH:MISC:MAL-VERSION
  SSH:MISC:UNIX-ID-RESP
  SSH:NON-STD-PORT
  SSH:OPENSSH-MAXSTARTUP-DOS
  SSH:OPENSSH:BLOCK-DOS
  SSH:OPENSSH:GOODTECH-SFTP-BOF
  SSH:OPENSSH:NOVEL-NETWARE
  SSH:OVERFLOW:FREESSHD-KEY-OF
  SSH:OVERFLOW:PUTTY-VER
  SSH:OVERFLOW:SECURECRT-BOF
  SSH:PRAGMAFORT-KEY-OF
  SSH:SYSAX-MULTI-SERVER-DOS
  SSH:SYSAX-SERVER-DOS

root@NP-vSRX-01# run show security idp attack description SSH:BRUTE-LOGIN
Description: This signature detects attempts by remote attackers to log in to an SSH server by brute-forcing the password.

Another great resource for finding the right signature is this page.

If you are using one of the predefined IDP policy sets, navigate down and edit the configuration. I am using Server-Protection-1G on mine.

The policy below will detect SSH brute-force attempts and, by using the ip-action, block the source IP for 30 minutes. That should slow ’em down.

[edit security idp idp-policy Server-Protection-1G rulebase-ips]
root@NP-vSRX-01# show rule SSH-SFTP-Services
match {
    from-zone any;
    source-address any;
    to-zone any;
    destination-address any;
    application default;
    attacks {
        predefined-attacks SSH:BRUTE-LOGIN;
    }
}
then {
    action {
        close-client;
    }
    ip-action {
        ip-block;
        target source-address;
        timeout 1800;
    }
    notification {
        log-attacks;
    }
}

You can rearrange the rules by using the insert policy before

Testing the policy

Running the script for a second time, it only took a small number of brute-force attempts until the alarm triggered. After a minute or two of time-outs, the script gave up and started spawning python errors.

lab@V120-10:/tmp/brutessh$ python brutessh.py -h 1.1.1.10 -u admin -d passlist.txt

*************************************
*SSH Bruteforcer Ver. 0.2           *
*Coded by Christian Martorella      *
*Edge-Security Research             *
*laramies@gmail.com                 *
*************************************

HOST: 1.1.1.10 Username: admin Password file: passlist.txt
===========================================================================
Trying password...
abc123
Exception in thread carmen
:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "brutessh.py", line 44, in run
    t = paramiko.Transport(hostname)
  File "/tmp/brutessh/paramiko/transport.py", line 235, in __init__
    sock.connect((hostname, port))
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 110] Connection timed out
Exception in thread test
:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "brutessh.py", line 44, in run
    t = paramiko.Transport(hostname)
  File "/tmp/brutessh/paramiko/transport.py", line 235, in __init__
    sock.connect((hostname, port))
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 110] Connection timed out
Exception in thread password1
:

Some verification commands on the SRX itself.

root@NP-vSRX-01> show security log | match IDP:
2015-09-17 19:22:45 UTC  IDP: at 1442517764, SIG Attack log <2.2.2.1/6768->1.1.1.10/22> for TCP protocol and service SERVICE_IDP application NONE by rule 6 of rulebase IPS in policy Server-Protection-1G. attack: repeat=0, action=CLOSE_CLIENT, threat-severity=MEDIUM, name=SSH:BRUTE-LOGIN, NAT <0.0.0.0:0->10.0.4.10:0>, time-elapsed=0, inbytes=0, outbytes=0, inpackets=0, outpackets=0, intf:untrust:ge-0/0/0.0->dmz:ge-0/0/2.0, packet-log-id: 0, alert=no, username=N/A, roles=N/A and misc-message -

The SRX detected three login attempts through before shunning the remote IP

root@NP-vSRX-01> show security idp attack table
IDP attack statistics:

  Attack name                                  #Hits
  SSH:BRUTE-LOGIN                              3
root@NP-vSRX-01> show security idp attack detail SSH:BRUTE-LOGIN
Display Name: SSH: Brute Force Login Attempt
Severity: Minor
Category: SSH
Recommended: true
Recommended Action: Drop
Type: signature
Direction: CTS
False Positives: rarely
Service: SSH
Shellcode: no
Flow: control
Context: first-data-packet
Negate: false
TimeBinding:
        Scope: peer
        Count: 10
Hidden Pattern: False
Pattern: \[SSH\].*
root@NP-vSRX-01> show security idp counters action
IDP counters:

  IDP counter type                                                      Value
 None                                                                    0
 Recommended                                                             0
 Ignore                                                                  0
 Diffserv                                                                0
 Drop packet                                                             0
 Drop                                                                    0
 Close                                                                   0
 Close server                                                            0
 Close client                                                            3
 IP action rate limit                                                    0
 IP action drop                                                          3
 IP action close                                                         0
 IP action nofity                                                        0
 IP action failed                                                        0

Wrapping up

Although very basic, this example demonstrated that it absolutely pays off to invest some time in crafting your own IDP rules. Keeping a detailed inventory of the services you are hosting and matching it against the associated application-level attack signatures will greatly increase your security posture. Combined with the Screen Options, which protect against some of the L3/L4 attacks, this makes the SRX well worth considering 🙂