Sunday, August 2, 2020

Generalization of list operations using foldr

Introduction:

List is a generic (polymorphic) type, which means it can take many shapes such as List[Int], List[String], etc. To provide strong type checking, a statically typed language's compiler has to know the type of data the list will contain. Generally we either mention the contained type in the definition itself, or let the compiler infer it when it is not mentioned, e.g.

val a:List[Int] = List(1,2,3,4) //defined with contained type

val b = List(1,2,3,4) //defined without contained type

The List type also provides many operations which behave uniformly over many contained types. Examples include length, map, filter, etc. Each of these operations acts on list elements one at a time. Below we will see how these operations can be defined using a more generalized scheme provided by another list operation called foldr.

List Operations length, map, filter

List operations usually apply their respective algorithm to each member of the list one at a time, recursing on the rest of the list until they reach its end. Following are the definitions of some common operations.

def map[A, B](fn: A => B, ls: List[A]): List[B] = {
        ls match {
                case Nil => Nil
                case h :: t => fn(h) :: map(fn, t)
        }
}

def filter[A](fn: A => Boolean, ls: List[A]): List[A] = {
        ls match {
                case Nil => Nil
                case h :: t => if (fn(h)) h :: filter(fn, t) else filter(fn, t)
        }
}

def length(ls: List[_]): Int = {
        ls match {
                case Nil => 0
                case _ :: t => 1 + length(t)
        }
}

If we observe these operations closely, they use the same common pattern.

  • Recursion scheme: each operation recurses over the list, calling itself until it reaches the end of the list.

Following are the variable parts in the above operations.

  1. Computation and aggregation: in the case of a non-empty list, each operation applies some function (the fn argument in map and filter) to the head of the list. In the case of length we can assume a function which ignores its input and returns the constant 1. After computing we do some kind of aggregation, which assembles the final result: for length it is addition (+), and for map and filter it is the list cons (::) operator.

  2. The value to return for the empty list.

If we abstract out the above two variable parts as arguments, we can generalise further and obtain a more abstract function. It will have a signature like the one below:

def operateAndAggregateOnList[A, B](compAgg: (A, B) => B, zero: B, ls: List[A]): B

But we do not need to write it ourselves, as the List class already has a similar operation (called foldRight in Scala); here we will call it foldr. Following is the definition of foldr:

def foldr[A, B](fn: (A, B) => B, zero: B, ls: List[A]): B = {
        ls match {
                case Nil => zero
                case h :: t => fn(h, foldr(fn, zero, t))
        }
}

As you can see, the foldr definition looks very similar to the operations map, filter and length, except that we have abstracted out the base-case return value and the compute-and-aggregate function. We can now define map, length and filter in terms of foldr.

def map[A, B](fn: A => B, ls: List[A]): List[B] = foldr[A, List[B]]((a, b) => fn(a) :: b, Nil, ls)

def filter[A](fn: A => Boolean, ls: List[A]): List[A] = foldr[A, List[A]]((a, b) => if (fn(a)) a :: b else b, Nil, ls)

def length[A](ls: List[A]): Int = foldr[A, Int]((_, b) => 1 + b, 0, ls)
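
For a quick sanity check, here is how the foldr-based definitions above could be used (the expected results are shown in comments):

map((x: Int) => x * 2, List(1, 2, 3))            // List(2, 4, 6)
filter((x: Int) => x % 2 == 0, List(1, 2, 3, 4)) // List(2, 4)
length(List("a", "b", "c"))                      // 3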

concat in terms of foldr

Lastly, I want to show one more function, concat, which concatenates two lists. The general scheme is the same in this case as well.

def concat[A](firstLs: List[A], secLs: List[A]): List[A] = {
        firstLs match {
                case Nil => secLs
                case h :: t => h :: concat(t, secLs)
        }
}

Can we define concat in terms of foldr? Of course we can. In foldr argument terminology, the zero argument is the second list secLs, and fn is (A, List[A]) => List[A]:

def concat[A](firstLs: List[A], secLs: List[A]): List[A] = foldr[A, List[A]]((a, b) => a :: b, secLs, firstLs)

conclusion

foldr abstracts the computation-and-aggregation scheme over a list. As we have seen, it is a generalized function and can be used to define other list functions which share the common list traversal pattern.
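
As a side note, the Scala standard library already provides this abstraction as the foldRight method on List, with the list as the receiver and the aggregation function in a second, curried parameter list. A small sketch of the same idea using the built-in method (map2 and length2 are just alternative names so they do not clash with the definitions above):

def map2[A, B](fn: A => B, ls: List[A]): List[B] = ls.foldRight(List.empty[B])((a, b) => fn(a) :: b)

def length2[A](ls: List[A]): Int = ls.foldRight(0)((_, b) => 1 + b)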


Saturday, July 11, 2020

Java Generics Type Bounds


Introduction 

This article is an introduction into Java generic type bounds.

Normal Type

Typically, in Java, normal types are covariant. By that I mean a supertype reference can point to a subtype object. This allows the substitution principle to take shape in our day-to-day coding. I bet you have seen this code snippet before:

Animal animal = new Bird();

Generic Type

Besides normal types we have special types in Java. These types take other types as input for their existence, e.g. List<T>; they are called generic types. The List<T> type needs the parameter T to take shape, where T is called the type parameter and List is the generic type.
We generally use generic types to hold normal types and apply generic algorithms through them, e.g. sorting a List, mapping an Optional value to some other value, etc. Other generic type examples are Class<T>, Map<K, V>, Stream<T>, Optional<T>, etc.

The substitution principle (that I mentioned earlier for normal types) needs some tweaking to work with generic types and their type parameters. This means you will get a compilation error if you try to do this:

class Holder<T> {...}

Holder<Animal> holder = new Holder<Bird>();

Though Bird is a subtype of Animal, we cannot substitute a Holder<Bird> for a Holder<Animal> reference: generic types are invariant in their type parameters. Next I will introduce type bounds, which allow the substitution principle to work with type parameters as well.

Type Bounds

Type bounds allow us to put extra constraints on type parameters. Through these constraints we can limit the allowed types for a type argument. There are two flavors, called upper bound and lower bound. Next I will explain them one by one.

Upper Bound

With an upper bound, the constraint fixes the supertype: the type argument is bounded from above and can be any subtype of that supertype. For references we use the wildcard syntax below.

MyGenericClass<? extends SuperType> ref;

Animal in the example below is the type constraint. The holdRef reference can point to any Holder object whose type argument is a subtype of Animal, e.g.

Holder<? extends Animal> holdRef;

holdRef = new Holder<Bird>();
holdRef = new Holder<Dog>();
 

Lower Bound

A lower bound is very similar to an upper bound, but the constraint direction is reversed: the constraint fixes the subtype, and the type argument can be any supertype of that subtype. We use the following syntax for it.

MyGenericClass<? super SubType> ref;

In the example below Dog is the type constraint. It limits the type argument to be a supertype of Dog; the holdRef reference can point to any Holder object whose type argument is a supertype of Dog, e.g.

Holder<? super Dog> holdRef;
holdRef = new Holder<Mammal>();
holdRef = new Holder<Animal>();

Conclusion

In Java we can use generic types to represent classes that have very general, type-agnostic functionality, e.g. the size of a list: it does not matter whether the List contains Integer, String, etc.

They can also represent functionality that has some type limitation but can still be generalized to a large number of types, e.g. sorting, where the type needs to implement the Comparable interface through which we can know the relative order of two elements. It does not matter whether we pass Integer, String or a user-defined class like Employee to the sort algorithm, as long as the type implements Comparable.

Type bounds allow us to put constraints on the type parameters of generic types.
An upper bound limits the type argument to be a subtype of the constraint type, while a lower bound limits it to be a supertype of the constraint type. Both prevent nasty runtime exceptions and save the programmer time by surfacing errors at compilation time.
    




   

Friday, July 3, 2020

Akka Typed Notes

<draft>
Traditionally, actors have a "receive" method which handles the protocol messages, and unhandled messages are taken care of by a separate "unhandled" method. This is a drawback for the evolution of protocols: if someone adds a new protocol message, handling of that new message has to be added at the same time, otherwise the new message will end up among the unhandled messages. Hence it is error prone. Typed actors are designed to remove this limitation and to be "protocol first".

Protocol first means the set of messages an actor understands (its protocol) is declared up front in its type, so you do not have to worry about arbitrary messages arriving at the receive method. The intent of Akka Typed is not merely to declare messages in a structured way or to catch a few unhandled messages; the protocol becomes part of the actor's type and is checked by the compiler.

Actors are defined in terms of a Behavior, which has two main tasks: process the current message, and indicate what the behavior for the next message should be. The second task is done by returning the next behavior; if there is no change in behavior it can return itself.
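
As a rough illustration of these two tasks, a minimal counter behavior might look like this (the Counter object and Increment message are made up for the example, assuming the Akka 2.6 typed APIs):

import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors

object Counter {
  final case class Increment(amount: Int)

  def counting(current: Int): Behavior[Increment] =
    Behaviors.receiveMessage[Increment] { msg =>
      // task 1: process the current message
      val next = current + msg.amount
      // task 2: return the behavior to use for the next message
      counting(next)
    }
}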

A few implementation differences: a type parameter is added to the types ActorRef, ActorContext and ActorSystem. There is no implicit sender stored in AbstractBehavior.
ActorRef is now typed, which means an actor can specify which type of messages it can receive.
The "Signal" message type signifies a change in the lifecycle of an actor. There are no predefined callback methods invoked on actor lifecycle changes.

Behaviour Factories:
1. Behaviors.receiveMessage[T] gives the basic template for defining a behavior from a message handler.
2. Behaviors.receive[T] is the same as receiveMessage but additionally provides the ActorContext object along with the message. The context, among other things, allows us to create child actors.
3. Behaviors.same is returned when we want to keep the same behavior for the next message as well. Behaviors.empty is a behavior that treats every message as unhandled, and Behaviors.ignore silently discards every message.
4. Behaviors.unhandled marks the current message as not handled; such messages are published to the event stream, which is similar to the unhandled function in classic Akka.
5. To stop an actor, it can return Behaviors.stopped after its work is done.
6. The best way to start the system is to put initialization code in the guardian's behavior. Behaviors.setup helps us achieve that: it takes a factory function which is run when the actor is started, before the first message is processed, and which returns the behavior to be applied to the first message. The factory function is passed the ActorContext as a parameter, which can for example be used for spawning child actors (see the sketch after this list).
7. Behaviors.receiveSignal is used to receive signal messages, which an actor gets, for instance, when it is watching another actor via context.watch. To observe the different stages of an actor's lifecycle (called signals), an actor registers a PartialFunction[(ActorContext[T], Signal), Behavior[T]] by calling Behaviors.receiveSignal. Signals include PreRestart, PostStop and Terminated.
8. Behaviors.withTimers gives access to a TimerScheduler through which an actor can schedule messages to itself.
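
A sketch tying several of these factories together (Worker, Job and Stop are invented names, again assuming the Akka 2.6 typed APIs):

import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors

object Worker {
  sealed trait Command
  final case class Job(payload: String) extends Command
  case object Stop extends Command

  def apply(): Behavior[Command] =
    // setup runs the initialization function when the actor is started
    Behaviors.setup[Command] { context =>
      context.log.info("worker started")
      Behaviors.receiveMessage[Command] {
        case Job(payload) =>
          context.log.info("processing {}", payload)
          Behaviors.same    // keep the same behavior for the next message
        case Stop =>
          Behaviors.stopped // stop once the current work is done
      }
    }
}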

StashBuffer allows an actor to temporarily store messages until the actor changes to the behavior that is intended to handle them. Too many stashed messages cause a StashOverflowException, so choose the capacity carefully.
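
A minimal sketch of stashing until the actor is ready, assuming Akka 2.6 (DbActor, Initialized and Request are invented names, and the capacity of 100 is arbitrary):

import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors

object DbActor {
  sealed trait Command
  case object Initialized extends Command
  final case class Request(query: String) extends Command

  def apply(): Behavior[Command] =
    // capacity is arbitrary; too small a value risks StashOverflowException
    Behaviors.withStash[Command](100) { buffer =>
      Behaviors.receiveMessage[Command] {
        case Initialized =>
          // switch to the ready behavior and replay everything stashed so far
          buffer.unstashAll(ready)
        case other =>
          buffer.stash(other) // defer until initialization has completed
          Behaviors.same
      }
    }

  private def ready: Behavior[Command] =
    Behaviors.receiveMessage[Command] { _ => Behaviors.same }
}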

ActorContext's spawnAnonymous and spawn methods allow creating new child actors. Both have overloads which take an additional Props argument, through which for example a dispatcher can be selected; the default dispatcher is configured via the setting key "akka.actor.default-dispatcher". ActorContext also has a "watch" method through which an actor can observe another actor's lifecycle signals, i.e. have a death pact with it. An alternative to watch is "watchWith", which allows specifying a custom message instead of the Terminated signal. This is often preferred over watch and Terminated because additional information can be included in the message and used later when receiving it (see the sketch after the list below). The context also provides a "log" object. Many of the methods in ActorContext are not thread-safe and
1. Must not be accessed by threads from scala.concurrent.Future callbacks.
2. Must not be shared between several actor instances.
3. Must only be used in the ordinary actor message processing thread.
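
A small sketch of spawn and watchWith, reusing the Worker behavior from the factories sketch above (Parent and ChildDone are invented names):

import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors

object Parent {
  sealed trait Command
  final case class ChildDone(name: String) extends Command

  def apply(): Behavior[Command] =
    Behaviors.setup[Command] { context =>
      // spawn a named child (spawnAnonymous works the same, minus the name)
      val child = context.spawn(Worker(), "worker-1")
      // watchWith delivers our own ChildDone message instead of the Terminated signal
      context.watchWith(child, ChildDone("worker-1"))
      Behaviors.receiveMessage[Command] {
        case ChildDone(name) =>
          context.log.info("{} terminated", name)
          Behaviors.same
      }
    }
}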

SpawnProtocol is a predefined behavior that spawns a new child actor for each Spawn message it receives, whether sent via tell or ask.

ActorSystem's apply method takes the root guardian actor's behavior and a name. When ActorSystem.terminate is invoked, the coordinated shutdown process stops actors and services in a specific order.

Typed Akka cluster has a system-level actor called the "receptionist" which accepts the following commands: 1. Find 2. Subscribe 3. Register. Through the receptionist, one actor can discover other actors.

MessageAdapter ???

Behavior.validateAsInitial: validates that a behavior is legitimate as an initial behavior, since Behaviors.same and Behaviors.unhandled must not be used as the initial behavior.


Supervision: an actor can stop itself, or is stopped when its parent stops. A child actor can be forced to stop, after it finishes processing its current message, by using the stop method of the parent actor's ActorContext; only child actors can be stopped that way. The default supervision strategy of a typed actor is to stop itself in case of any failure or exception. To use a different strategy for a particular kind of exception use: Behaviors.supervise(<behavior>).onFailure[<ExceptionType>](SupervisorStrategy.stop | resume | restart)
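
A minimal sketch of wrapping a behavior with a supervision strategy, again reusing the Worker behavior from above (the choice of IllegalStateException is just for illustration):

import akka.actor.typed.{Behavior, SupervisorStrategy}
import akka.actor.typed.scaladsl.Behaviors

object SupervisedWorker {
  // restart on IllegalStateException instead of the default stop strategy
  def apply(): Behavior[Worker.Command] =
    Behaviors.supervise(Worker())
      .onFailure[IllegalStateException](SupervisorStrategy.restart)
}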


Problem statement: an actor has to speak two protocols.
Solution 1: use a message adapter to convert messages from one protocol to the other; the "ActorContext.messageAdapter" method can be used. Scala case classes can wrap the foreign protocol messages into the actor's own message protocol: the adapter wraps each second-protocol message in a case class of the first protocol, and the case class apply/unapply functions can be leveraged to convert back and forth between the two protocols (see the sketch below).
Solution 2: use a child actor to speak the second protocol. But in this case a new protocol between the child and parent actor is still required, usually representing success/failure/exceptional messages. The child needs to be death-watched; the parent spawns the child and passes it as the receiving actor for the second protocol.
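
A hedged sketch of solution 1 with a message adapter (Frontend, Backend, Query and WrappedResponse are all invented names for the two protocols):

import akka.actor.typed.{ActorRef, Behavior}
import akka.actor.typed.scaladsl.Behaviors

object Backend {
  final case class Request(query: String, replyTo: ActorRef[Response])
  final case class Response(result: String)
}

object Frontend {
  sealed trait Command
  final case class Query(query: String) extends Command
  // wrapper case class folding the foreign protocol into this actor's own protocol
  private final case class WrappedResponse(response: Backend.Response) extends Command

  def apply(backend: ActorRef[Backend.Request]): Behavior[Command] =
    Behaviors.setup[Command] { context =>
      // the adapter ref accepts Backend.Response and delivers it to self as WrappedResponse
      val adapter: ActorRef[Backend.Response] =
        context.messageAdapter[Backend.Response](rsp => WrappedResponse(rsp))
      Behaviors.receiveMessage[Command] {
        case Query(q) =>
          backend ! Backend.Request(q, adapter)
          Behaviors.same
        case WrappedResponse(rsp) =>
          context.log.info("backend answered: {}", rsp.result)
          Behaviors.same
      }
    }
}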

Problem statement: defer messages until the actor is ready.
Solution: stash the messages until the actor is ready (see the StashBuffer sketch above).

Testing an actor: due to the asynchrony of an actor, a test can give a false negative, meaning the actor did send the reply but not within the arbitrary time limit set by the test. Instead, a typed actor can be tested by checking the behavior returned for the next message; since the processing is synchronous, the test becomes more deterministic.
BehaviorTestKit allows us to test a typed actor synchronously. It is passive, which means a message sent to its ref is not processed immediately; to process it we have to call the runOne() (or run(message)) method explicitly. It also has other methods to verify the actor's behavior, such as isAlive for checking that the actor is still alive. TestInbox is another such class, representing an ActorRef that acts as a sink for messages. BehaviorTestKit also provides the test context that the actor under test uses to spawn child actors, invoke death watch, etc.; this context helps in verifying the various activities the actor is supposed to perform.
BehaviorTestKit's childTestKit(childActorRef) gives a test kit for a child actor, which can be further tested using the same methodology.
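
A small sketch of a synchronous test with BehaviorTestKit and TestInbox (requires the akka-actor-testkit-typed module; the Ping/echo behavior is made up for the example):

import akka.actor.testkit.typed.scaladsl.{BehaviorTestKit, TestInbox}
import akka.actor.typed.ActorRef
import akka.actor.typed.scaladsl.Behaviors

object EchoSpec extends App {
  final case class Ping(replyTo: ActorRef[String])

  // behavior under test: replies "pong" to every Ping
  val echo = Behaviors.receiveMessage[Ping] { msg =>
    msg.replyTo ! "pong"
    Behaviors.same
  }

  val testKit = BehaviorTestKit(echo)  // synchronous (passive) test kit
  val inbox   = TestInbox[String]()    // stand-in ActorRef that collects messages

  testKit.run(Ping(inbox.ref))         // process one message explicitly
  inbox.expectMessage("pong")          // verify the reply
  assert(testKit.isAlive)              // the actor did not stop itself
}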


Variance note: ActorRef is contravariant in its type parameter. If S is a subtype of T, a reference called with an ActorRef[T] can be captured in an ActorRef[S], since an actor that accepts T messages also accepts S messages.
