You have several key pieces of information, and you asked an important question about the IPL level. I will assume IPL 1 to keep this simple; we can double it later. One of your main drivers will be shard count, and a good rule of thumb is 50-100 M objects per shard. Please be aware that shard count cannot currently be changed after configuration (it is on the roadmap; PM can answer about timing and version), so we also want to account for future growth when figuring out the total number of shards. I would use 50 M, since that lets each shard grow toward 100 M.
So 1,500,000,000 / 50,000,000 = 30 shards. Then we figure out how we want to distribute the shards. For example, do we want to set up an 8-node system with 5 workers and 3 masters, or will the masters also hold the index? If you go with 5 workers, it is 6 shards per node. If you go with a 4-node system, it is 8 shards per node, which rounds the total up to 32. I would feel comfortable with the 8-node option (5 workers), since that is IPL 1; for IPL 2 we will have to double it.
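The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name plan_shards and its parameters are made up for this post and are not part of any product API. It assumes shards are spread evenly across the nodes that hold the index, and that IPL 2 simply doubles the IPL 1 count, as described above.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoids float rounding on large counts."""
    return -(-a // b)


def plan_shards(total_objects: int, objects_per_shard: int,
                index_nodes: int, ipl: int = 1) -> dict:
    """Hypothetical helper: estimate shard counts for a given layout.

    Assumptions (not confirmed by any product documentation):
    - shards are distributed evenly over the nodes holding the index
    - the total is rounded up so every node holds the same number
    - the shard count scales linearly with the IPL (IPL 2 = double)
    """
    base = ceil_div(total_objects, objects_per_shard)   # 1.5B / 50M = 30
    per_node = ceil_div(base, index_nodes)              # shards on each node
    total = per_node * index_nodes                      # rounded-up total
    return {
        "base_shards": base,
        "shards_per_node": per_node,
        "total_shards": total,
        "total_at_ipl": total * ipl,
    }


if __name__ == "__main__":
    # 8-node system with 5 workers holding the index, IPL 1
    print(plan_shards(1_500_000_000, 50_000_000, index_nodes=5))
    # 4-node system where every node holds the index, IPL 1
    print(plan_shards(1_500_000_000, 50_000_000, index_nodes=4))
```

With 5 index nodes this gives 6 shards per node (30 total); with 4 it gives 8 per node (32 total). Passing ipl=2 doubles the final figure under the assumption stated above.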
Finding out the exact data size can be a challenge without testing and extrapolating from the data. I would also want to know the change rate, as that may factor into the 50-100 M goal. The other piece will be how many users, connections, and queries will be hitting the index, as performance can also be impacted there.
Troy