
quick sort algorithm improvement if more duplicate

Posted 2020-02-09 05:25

Question:

I am reading about the quicksort algorithm in Robert Sedgewick's book Algorithms in C++, Parts 1-4.

template <class item>
static void quicksort(item a[], int l, int r)
{
    if(r <= l) return;
    int i = partition(a, l, r);  // a[i] is now in its final position
    quicksort(a, l, i-1);        // sort the left subfile
    quicksort(a, i+1, r);        // sort the right subfile
}

template <class item>
int partition(item a[], int l, int r)
{
    int i = l-1, j = r; item v = a[r];  // v is the partitioning element

    for(;;) {
        while( a[++i] < v );            // scan from the left for an element >= v
        while( v < a[--j] )             // scan from the right for an element <= v
            if( j == l )
                break;
        if( i >= j)
            break;  // pointer crossing.
        exch(a[i], a[j]);
    }

    exch(a[i], a[r]);                   // put the partitioning element in place
    return i;
}

The book has the following text on the above algorithm.

When duplicate keys are present in the file, the pointer crossing is subtle. We could improve the partitioning process slightly by terminating the scans when i < j, and then using j, rather than i-1, to delimit the right end of the left subfile for the first recursive call. Letting the loop iterate one more time in this case is an improvement, because, whenever the scanning loops terminate with j and i referring to the same element, we end up with two elements in their final positions: the element that stopped both scans, which must therefore be equal to the partitioning element, and the partitioning element itself. This change is probably worth making, because, in this particular case, the program leaves a record with a key equal to the partitioning key in a[r], and that makes the first partition in the call quicksort(a, i+1, r) degenerate, because its rightmost key is its smallest.

My questions are:

  1. How can we modify the above program according to the book's description? I am having difficulty modifying it in order to understand the text.
  2. Why does the above quicksort algorithm not work effectively when many duplicate keys are present?
  3. How does the above modification help when many duplicate keys are present?
  4. What does the author mean by "the first partition in the call quicksort(a, i+1, r) degenerate, because its rightmost key is its smallest"? What does the author mean by degenerate here?

Thanks for your time and help.

Answer 1:

>>Why does the above quicksort algorithm not work effectively when many duplicate keys are present?

It becomes inefficient because your breaking condition is: if(i >= j) break;
So, as you scan from both sides with i and j, it is quite possible that you break when i == j instead of letting i cross over j.

What bad thing could possibly happen when we break on i == j while many duplicate keys are present?

When you break on i == j: from the first while loop you must have had a[i] >= v, and from the second while loop a[j] <= v. But since we are considering a 'break' with i == j, we get a[i] = a[j] = v, i.e. a[i] is the same as v, your pivot element.

In such a scenario, your final exch(a[i], a[r]); will simply exchange the pivot value with itself.
Hence, in your next recursive call quicksort(a, i+1, r); for the right half of the array, you would have the smallest element of that subfile sitting at its rightmost end (your pivot-choosing strategy is simply item v = a[r];), and we all know it is bad for quicksort to choose a pivot element that is the minimum or the maximum of the array. Hence, your subsequent recursive call for the right half will be a degenerate one.
That is why the author advises not to break on i == j, but to catch the pointers just before that happens.
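
Here is a minimal, self-contained trace of that situation (my own test harness, not from the book or the original answer): the book's partition run once on {1, 2, 3, 2}. Both scans stop on the duplicate key 2 at index 1, the loop exits with i == j, the final exchange swaps the pivot with itself, and the right subfile {3, 2} is left with its smallest key in the rightmost slot, which is exactly the degenerate case described above.

#include <iostream>
#include <utility>   // std::swap

// the book's partition, specialized to int for the demo
int partitionOriginal(int a[], int l, int r)
{
    int i = l-1, j = r, v = a[r];
    for(;;) {
        while( a[++i] < v );
        while( v < a[--j] )
            if( j == l ) break;
        if( i >= j ) break;           // here the scans can stop with i == j
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);            // with i == j this swaps the pivot with itself
    return i;
}

int main()
{
    int a[] = {1, 2, 3, 2};
    int p = partitionOriginal(a, 0, 3);
    std::cout << "split index: " << p << "\n";   // prints 1
    std::cout << "array after partition:";
    for (int x : a) std::cout << " " << x;       // prints 1 2 3 2
    std::cout << "\n";                           // right subfile a[2..3] = {3, 2}: rightmost key is its smallest
    return 0;
}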

>>What does the author mean by degenerate here?

Degenerate here means that the recursion tree is getting skewed, i.e. the subproblems are not of nearly equal sizes. You are dividing a problem of size N into problems of size N-1 and 1 instead of something more balanced, like dividing it into problems of size N/2 and N/2. If that keeps happening, the recursion depth grows to N and the total work becomes quadratic instead of N log N.

>>How can we modify the above program according to the book's description?

We could implement it like the following (specialized to int):

#include <algorithm>   // std::swap

int partition(int A[], int l, int r)
{
    int i = l-1, j = r, v = A[r];
    for(;;) {
        while( A[++i] < v );
        while( A[--j] > v )
            if( j == l )
                break;
        if( i >= j )
            break;
        std::swap(A[i], A[j]);
    }
    if( i == j && A[j] == v ){   // both scans stopped on the same element, and it equals the pivot
        j = j + 1;               // move j forward one slot; the pivot goes there (j + 1 <= r, since j was decremented at least once)
        std::swap(A[j], A[r]);   // now both A[i] and A[j] hold the pivot value, in their final positions
        return j;                // partition the subsequent problems around j
    }
    std::swap(A[i], A[r]);
    return i;
}
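
As a quick sanity check (my own driver and test data, not part of the original answer), this partition drops into the usual recursive driver; assuming the function above is defined earlier in the same file, the following sorts an array full of duplicate keys:

#include <iostream>

void quicksort(int a[], int l, int r)
{
    if(r <= l) return;
    int p = partition(a, l, r);   // the modified partition above
    quicksort(a, l, p-1);
    quicksort(a, p+1, r);
}

int main()
{
    int a[] = {2, 1, 2, 3, 2, 2, 1, 3, 2, 2};
    int n = sizeof a / sizeof a[0];
    quicksort(a, 0, n-1);
    for (int x : a) std::cout << x << " ";   // prints 1 1 2 2 2 2 2 2 3 3
    std::cout << "\n";
    return 0;
}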

>>How does the above modification help when many duplicate keys are present?

It improves performance by not generating an obvious instance of the degenerate case: whenever both scans stop on a key equal to the pivot, two equal keys are put into their final positions at once, and the right subfile no longer starts out with its smallest key at its rightmost end.